Reddit is going to start charging large companies for access to its data, and stop large tech companies from hoovering up its user content to train chatbots.
On Tuesday, the social media site shared that it would launch a “new premium access point” to its application programming interface (API) for those who “require additional capabilities, higher usage limits, and broader usage rights.” An API is the interface through which one program requests data or services from another.
“We’re working to build a more sustainable, healthy ecosystem around data on Reddit,” the company said in its announcement. The company did not share pricing details, but said the rules would go into effect on June 19.
The company also didn’t go into detail about why it’s making the change, but Reddit CEO Steve Huffman suggested to the New York Times that A.I. is to blame.
Both Google and OpenAI have previously said they’ve used Reddit data to train their large language models, which underpin Google’s Bard and OpenAI’s ChatGPT.
Huffman suggested that “authentic conversation” on Reddit makes its data valuable to these models. “There’s a lot of stuff on the site that you’d only ever say in therapy, or A.A., or never at all,” he told the New York Times.
But “we don’t need to give all of that value to some of the largest companies in the world for free,” he said.
Ending free access?
Still, Reddit’s introduction of a paid tier of access to its systems may be a big change for developers used to accessing the site’s data for free.
It follows a similar decision from Twitter CEO Elon Musk, who started to charge for access to the company’s application programming interface earlier this year, with monthly access fees stretching into the tens, if not hundreds, of thousands of dollars.
Musk said the move was needed to end automated spam on the platform. Yet the API change also hinders accounts that post automated updates, like those sharing important news on natural disasters or extreme weather.
Users commenting on Reddit’s announcement expressed concern about what the changes would mean for third-party applications that display the website’s content. Reddit posted more details on its rule changes to its official subreddit (the term the company uses for communities), including a limit on access to “mature content” posted to the platform.
The developer of one third-party reader for Reddit content, citing calls with company staff, suggested that free API access for such programs would end. He added that, according to staff, the changes were needed because of the cost of server access and “the opportunity costs of users not using the official app.” (Fortune has reached out to the developer in question.)
Huffman claimed to the New York Times that Reddit would still give free access to developers working to improve the experience on the website, such as by building an automated program to assist with moderation. Researchers using Reddit data would also get free access, he said.
Reddit did not immediately respond to a request for comment made outside of U.S. business hours.
Users have flocked to Reddit’s content, with some suggesting that its communities provided better answers to user questions than an ordinary Google search. Part of the appeal of new chatbots like ChatGPT and Bing A.I. is their ability to provide natural-sounding answers to user queries, like “What are the best restaurants in Mexico City?”—hallucinations, or entirely made-up answers, notwithstanding.
Alphabet, Google’s parent company, sees A.I. as a significant threat to its search business, and is reportedly working to integrate A.I. into its existing search engine and develop an entirely A.I.-driven search product.
Yet using user-generated content to train A.I. models can be controversial, especially given that creators rarely give explicit permission for their work to be used in this way. Artists in particular complain that A.I. programs like image generator Stable Diffusion scrape their artwork, then allow users to generate new pictures using their artistic style.
Reddit and its CEO may now want to find a way to monetize the conversations on Reddit, especially as the company reportedly prepares for an IPO in the second half of the year.
“Crawling Reddit, generating value and not returning any of that value to our users is something we have a problem with,” Huffman told the New York Times.