Reddit to update web standard to block automated website scraping


FILE PHOTO: Reddit's logo is displayed, at the New York Stock Exchange (NYSE) in New York City, U.S., March 21, 2024. REUTERS/Brendan McDermid/File Photo

(Reuters) - Social media platform Reddit said on Tuesday it will update a web standard used by the platform to block automated data scraping from its website, following reports that AI startups were bypassing the rule to gather content for their systems.

The move comes at a time when artificial intelligence firms have been accused of plagiarizing content from publishers to create AI-generated summaries without giving credit or asking for permission.

Reddit said that it would update the Robots Exclusion Protocol, or "robots.txt," a widely accepted standard meant to determine which parts of a site are allowed to be crawled.
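For readers unfamiliar with the mechanism, the sketch below shows how a crawler that chooses to honor robots.txt might check a URL before fetching it, using Python's standard-library `urllib.robotparser`. The rules, the bot names and the example.com URL are invented for illustration and are not Reddit's actual file. Compliance is voluntary: the file states the site's wishes, and nothing in the protocol itself stops a crawler from ignoring it.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rules for illustration only; NOT Reddit's actual robots.txt.
# One named crawler is allowed everywhere, every other crawler is disallowed.
EXAMPLE_ROBOTS_TXT = """\
User-agent: ExampleArchiveBot
Allow: /

User-agent: *
Disallow: /
"""


def build_parser(robots_txt: str) -> RobotFileParser:
    """Parse robots.txt text that has already been downloaded."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


parser = build_parser(EXAMPLE_ROBOTS_TXT)

# A well-behaved crawler asks before fetching each URL.
url = "https://example.com/r/news"  # placeholder URL
print(parser.can_fetch("ExampleArchiveBot", url))  # True: explicitly allowed
print(parser.can_fetch("UnknownBot", url))         # False: falls under "*"
```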

The company also said it will maintain rate-limiting, a technique used to control the number of requests from one particular entity, and will block unknown bots and crawlers from scraping data (collecting and saving raw information) from its website.
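As a rough illustration of rate-limiting in general, the sketch below caps each client at a fixed number of requests within a sliding time window. It is a minimal example under assumed parameters, not a description of Reddit's system; the class name `SlidingWindowLimiter`, the limits and the client identifiers are all hypothetical.

```python
import time
from collections import defaultdict, deque


class SlidingWindowLimiter:
    """Allow at most `max_requests` per client within `window_seconds`."""

    def __init__(self, max_requests, window_seconds, clock=time.monotonic):
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self._clock = clock                 # injectable so tests can fake time
        self._hits = defaultdict(deque)     # client id -> recent request times

    def allow(self, client_id):
        now = self._clock()
        hits = self._hits[client_id]
        # Forget requests that have aged out of the window.
        while hits and now - hits[0] >= self.window_seconds:
            hits.popleft()
        if len(hits) >= self.max_requests:
            return False                    # over the limit: reject
        hits.append(now)
        return True


# Hypothetical limit: 3 requests per 60 seconds for each client.
limiter = SlidingWindowLimiter(max_requests=3, window_seconds=60)
results = [limiter.allow("crawler-1") for _ in range(5)]
print(results)  # [True, True, True, False, False]
```

Each client is tracked separately, so one entity exhausting its allowance does not affect requests from another.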

More recently, robots.txt has become a key tool that publishers employ to prevent tech companies from using their content free of charge to train AI algorithms and create summaries in response to some search queries.

Last week, a letter to publishers by the content licensing startup TollBit said that several AI firms were circumventing the web standard to scrape publisher sites.

This follows a Wired investigation that found that AI search startup Perplexity likely bypassed efforts to block its web crawler via robots.txt.

Earlier in June, business media publisher Forbes accused Perplexity of plagiarizing its investigative stories for use in generative AI systems without giving credit.

Reddit said on Tuesday that researchers and organizations such as the Internet Archive will continue to have access to its content for non-commercial use.

(Reporting by Harshita Mary Varghese; Editing by Alan Barona)
