Ex-OpenAI staffer claims the ChatGPT maker leverages "the fair use doctrine" to violate copyright law and destroy the internet — after Sam Altman admitted it's impossible to develop AI tools without copyrighted material

In this photo illustration, the OpenAI icon is displayed on a mobile phone screen in Ankara, Turkiye on August 13, 2024.

(Image credit: Getty Images | Anadolu)

What you need to know

A former OpenAI employee recently published a blog post highlighting the firm's transgressions, including breaking copyright law by using internet data to train ChatGPT.
The report suggests OpenAI relies on technicalities in copyright law to continue using copyrighted content and internet data to train AI models without authorization or compensation.
He also highlighted the role AI-generated content plays in ruining the internet, including by spreading inaccurate information.

Amid bankruptcy reports and efforts to restructure its business model into a for-profit venture, high-profile employees continue to depart from OpenAI. Suchir Balaji recently left OpenAI to work on "personal projects."

Balaji joined the ChatGPT maker shortly after graduating from UC Berkeley, hoping to be part of the team that leverages generative AI's cutting-edge capabilities to cure diseases and potentially stop aging. He predominantly worked on OpenAI's GPT-4 model, described as "mildly embarrassing at best," with Sam Altman admitting that it "kind of sucks."

However, the 25-year-old departed from the AI firm after realizing his goals weren't aligned with the company's. While speaking to the New York Times, Balaji indicated:

"AI companies are destroying the commercial viability of the individuals, businesses, and internet services that created the digital data used to train these A.I. systems."

He bluntly claimed that OpenAI breaks U.S. copyright law, a serious allegation coming from someone who has worked at the company. This isn't the first time OpenAI has come under fire over copyright infringement. The ChatGPT maker is fighting several copyright infringement lawsuits in court alongside Microsoft.

OpenAI CEO Sam Altman previously admitted developing tools like ChatGPT is virtually impossible without copyrighted content. He added that copyright law doesn't categorically prohibit training AI models using copyrighted content.

Is AI model training using copyrighted content fair use?

OpenAI logo (Image credit: Getty Images | NurPhoto)

In his blog post, Balaji attempted to show how OpenAI was breaking copyright law. Through his analysis, the former OpenAI staffer concluded that the information generated using ChatGPT doesn't meet the "fair use" threshold. For context, "fair use" is a legal standard that permits limited use of copyrighted content without the author's consent.

Following Balaji's copyright infringement claims, OpenAI issued the following statement to Gizmodo:

“We build our A.I. models using publicly available data, in a manner protected by fair use and related principles, and supported by longstanding and widely accepted legal precedents. We view this principle as fair to creators, necessary for innovators, and critical for US competitiveness.”

OpenAI and Microsoft constantly argue that using copyrighted content from the internet to train their AI models falls under fair use. However, Balaji seems to have a different opinion. While he admits that the information generated from the AI systems isn't directly lifted from the source, it's not original either. Balaji argues that AI-generated content is reminiscent of copyrighted material, and by this standard, it is illegal under copyright law.

Aside from copyright, Balaji raised concerns over the potential impact of AI tools like ChatGPT on the internet. Separately, a former Google engineer warned that OpenAI's temporary prototype search tool, SearchGPT, could give Google a run for its money in the foreseeable future, amid antitrust scrutiny after Google was classified as an illegal monopoly in search. Balaji also highlighted that AI is prone to generating inaccurate and misleading information.

“If you believe what I believe, you have to just leave the company,” Balaji added.


Kevin Okemwa is a seasoned tech journalist based in Nairobi, Kenya with lots of experience covering the latest trends and developments in the industry at Windows Central. With a passion for innovation and a keen eye for detail, he has written for leading publications such as OnMSFT, MakeUseOf, and Windows Report, providing insightful analysis and breaking news on everything revolving around the Microsoft ecosystem. While AFK and not busy following the ever-emerging trends in tech, you can find him exploring the world or listening to music.