Reddit is reportedly in the middle of a licensing deal worth "about $60 million on an annualized basis," which could allow an unnamed large AI company to train its models on content from subreddits.

(Image credit: Cole Martin)

What you need to know

A new report indicates that Reddit is in the middle of a megadeal with an unnamed large AI company worth about $60 million a year.
The deal could potentially allow the company to use the content from Reddit to train its AI models.
More details about the deal, the company's identity, and what it intends to do with the content it gets from Reddit remain unclear.

Generative AI is a hot topic in the technology landscape as more companies continue to warm up to it and integrate it into their workflows. In the past year and change, we've seen people using AI to unlock new heights and tap into new opportunities across education, medicine, computing, and more.

While this is impressive, there's growing concern about whether the safety and privacy measures in place are enough to prevent AI from spiraling out of control. There's also the issue of companies like Microsoft and OpenAI ~~stealing~~ using copyrighted information to train their models.

Around the start of this year, Microsoft and OpenAI were slapped with lawsuits by The New York Times and two nonfiction authors over alleged intellectual property theft. The companies argue that copyright law doesn't forbid the use of copyrighted material to train AI models.

Interestingly, OpenAI CEO Sam Altman admitted that it's virtually impossible to create ChatGPT-like tools without using copyrighted material, adding that restricting the training of these tools to copyright-free material would produce AI chatbots that cannot meet the average user's minimum requirements.

The copyright restrictions could help explain the increase in reports of chatbots like OpenAI's ChatGPT getting dumber and of Microsoft Copilot (formerly Bing Chat) declining in accuracy. As it happens, a large AI company is reportedly in the middle of a new licensing deal with Reddit worth "about $60 million on an annualized basis," according to a report by Bloomberg.

The deal could lead to the unnamed company using Reddit posts to train its AI models, though details about the deal remain slim. Reddit harbors credible information across its subreddits, coupled with comments and interactions from avid users. As such, it's a gold mine that can be leveraged to improve the capabilities of large language models (LLMs).

Compensation at last, but at what cost?

Sync For Reddit Surface Duo — (Image credit: Future)

For what seems like an eternity, companies like Microsoft and OpenAI have simply lifted information from websites and packaged it as their own in bite-size form, with little regard for crediting the source, let alone compensating it.

It was only last December that OpenAI was reported to be in the middle of a megadeal with German publisher Axel Springer, which will see it part with tens of millions of euros over three years (a first, if you ask me). In return, the tech company will have access to the publisher's articles (archived and current) to train its AI models.

However, it remains uncertain what kind of reception this will get. Reddit has had its fair share of issues and challenges over the past few years. You might remember last year's fiasco when the company announced plans to start charging for access to its APIs. The move led to thousands of forums being shut down in protest, which in turn led to the site crashing.

Kevin Okemwa is a seasoned tech journalist based in Nairobi, Kenya with lots of experience covering the latest trends and developments in the industry at Windows Central. With a passion for innovation and a keen eye for detail, he has written for leading publications such as OnMSFT, MakeUseOf, and Windows Report, providing insightful analysis and breaking news on everything revolving around the Microsoft ecosystem. While AFK and not busy following the ever-emerging trends in tech, you can find him exploring the world or listening to music.