Ever put content on the web? Microsoft says that it's okay for them to steal it because it's 'freeware.'

Image of Mustafa Suleyman, Microsoft AI CEO
This is a "freeware" image from Mustafa Suleyman's personal website that has been reproduced per the guidance of the Microsoft AI CEO. (Image credit: Mustafa Suleyman)

What you need to know

  • Microsoft's AI CEO claimed that content shared on the web is "freeware" that can be copied and used to create new content.
  • The remarks centered on Microsoft and other companies using preexisting content to train AI models.
  • The CEO claimed that there's a separate category of content that cannot be used to train AI, which is indicated by an organization explicitly stating "do not scrape or crawl me for any other reason than indexing me so that other people can find that content."

Microsoft may have opened a can of worms with recent comments made by the tech giant's CEO of AI, Mustafa Suleyman. The CEO spoke with CNBC's Andrew Ross Sorkin at the Aspen Ideas Festival earlier this week. In his remarks, Suleyman claimed that all content shared on the web is available to be used for AI training unless a content producer specifically says otherwise.

"With respect to content that is already on the open web, the social contract of that content since the 90s has been that it is fair use. Anyone can copy it, recreate with it, reproduce with it. That has been freeware, if you like. That's been the understanding," said Suleyman.

"There's a separate category where a website or a publisher or a news organization had explicitly said, 'do not scrape or crawl me for any other reason than indexing me so that other people can find that content.' That's a gray area and I think that's going to work its way through the courts."

Suleyman's quote raises several questions:

  • Is it actually okay to use other people's work to create new content?
  • If so, is it okay to profit off those recreations or work derivative of preexisting content?
  • How could websites and organizations "explicitly" say that their work cannot be used for AI training before AI became commonplace?
  • Has Microsoft respected any organization that specified content should only be used for search?
  • Have Microsoft's partners, including OpenAI, respected any demands that content not be used for AI training?

Several ongoing lawsuits suggest that publishers do not agree with Suleyman's take.

Training vs. stealing

Generative AI is one of the hottest topics in tech in 2024. It's also a hot-button topic among creators. Some claim that training AI on other people's work is a form of theft. Others equate training AI on existing work to artists studying at school. The dispute often centers on monetizing work that's derivative of other content.

YouTube has reportedly offered "lumps of cash" to train its AI models on music libraries from major record labels. The difference in that situation is that record labels and YouTube will have agreed to terms. Suleyman claims that a company could use any content on the web to train AI, as long as the publisher has not explicitly demanded otherwise.

Microsoft and OpenAI have been on the receiving end of several copyright infringement lawsuits. Eight US-based publishers filed suits against OpenAI and Microsoft, joining The New York Times, which already had an ongoing suit.

AI-generated content is controversial in ways other than its source material. An AI-generated animated video upset Pink Floyd fans when it became a finalist in an animation competition.

Assuming I've understood Suleyman correctly, the CEO claimed that any content is freeware that anyone can use to make new content, unless the creator says otherwise. I'm not a lawyer, but Suleyman's claims sound a lot like those viral chain messages that get forwarded around Facebook and Instagram saying, "I DO NOT CONSENT TO MY CONTENT BEING USED." I always assumed copyright law was more complicated than a Facebook post.

Sean Endicott
News Writer and apps editor

Sean Endicott is a tech journalist at Windows Central, specializing in Windows, Microsoft software, AI, and PCs. He's covered major launches, from Windows 10 and 11 to the rise of AI tools like ChatGPT. Sean's journey began with the Lumia 740, leading to strong ties with app developers. Outside writing, he coaches American football, utilizing Microsoft services to manage his team. He studied broadcast journalism at Nottingham Trent University and is active on X @SeanEndicott_ and Threads @sean_endicott_. 

  • Arun Topez
    That caption under the article image is excellent 🤣

    This is just another example that Microsoft don't give af about users' privacy or content. Users don't have the luxury of having big lawyers and patents and copyright protections that Microsoft themselves have. The fact that he also said that it's a "gray area" we're exploring regarding organizations who explicitly put do not crawl on their site, also shows their lack of care. Hopefully social media outlets will do something about this to protect user generated content from being used as "freeware" (even though that analogy is terrible, it's more like stealing people's work from someone's gallery or notebook).

    AI should be assisting users with fixing/adjusting content and replacing tedious repetitive tasks, not primarily generating content ripped off from other people, and replacing what people enjoy doing - expressing their creativity through their medium.
  • Opinion
    This person joined Microsoft just recently. It's sad to see how the first added value he brought seems to be this mindset.