Reddit reportedly signs $60 million annual training data deal with Google


Updated on February 22, 2024:

The AI company licensing Reddit's data is apparently Google, according to a Reuters report citing anonymous sources. Reuters confirms the license fee of 60 million dollars per year, although it remains unclear which data Reddit will provide in return, and how much of it.

Original article from February 17, 2024:

Reddit has signed a $60 million annual contract with an unnamed AI company to use the platform’s content to train its AI models.



According to Bloomberg, Reddit disclosed the deal in advance to potential investors in its planned IPO, which is expected to value the company at no less than five billion US dollars. The deal shows how Reddit can capitalize on the current interest in AI training data.

Other social media platforms could also sell their user content in this way and generate additional revenue. Meta and X use their social media data to train their own AI models.

Many assume that Reddit plays a central role in the training of large language models such as OpenAI’s GPT-3.5 or GPT-4, Meta’s LLaMA, or Google’s models.

This is because many Reddit posts already carry a human rating thanks to the platform’s upvote and downvote function, which facilitates pre-sorting. The posts also contain additional contextual links. Both of these factors make the data valuable to AI companies.

“The Reddit corpus of data is really valuable. But we don’t need to give all of that value to some of the largest companies in the world for free,” said Reddit co-founder Steve Huffman in the spring of 2023.

