YouTube says OpenAI training Sora with its videos would break the rules


  • AI
  • Friday, 05 Apr 2024

Using YouTube videos to train OpenAI's text-to-video generator would be against the terms of service of the platform, says Mohan. — Image by rawpixel.com on Freepik

The use of YouTube videos to train OpenAI’s text-to-video generator would be an infraction of the platform’s terms of service, YouTube chief executive officer Neal Mohan said.

In his first public remarks on the topic, Mohan said he had no firsthand knowledge of whether OpenAI had, in fact, used YouTube videos to refine its artificial intelligence-powered video creation tool, called Sora. But if that were the case, it would be a “clear violation” of YouTube’s terms of use, he said.

“From a creator’s perspective, when a creator uploads their hard work to our platform, they have certain expectations,” Mohan said Thursday in an interview with Emily Chang, host of Bloomberg Originals.

“One of those expectations is that the terms of service is going to be abided by. It does not allow for things like transcripts or video bits to be downloaded, and that is a clear violation of our terms of service. Those are the rules of the road in terms of content on our platform.”

There has been much public debate over what material OpenAI uses to train the AI models underlying popular content creation products such as ChatGPT and DALL-E. Sora and other generative AI tools work by ingesting large volumes of content from around the web and using that data as the foundation for generating new content, including videos, photos, narrative text and more.

As companies like OpenAI, Google and others race to develop more powerful artificial intelligence, they are looking to source as much content as possible to train their AI models and produce better-quality results. Google and YouTube are units of Alphabet Inc.

OpenAI, which is backed by Microsoft Corp., didn’t immediately respond to a request for comment. OpenAI chief technology officer Mira Murati said in an interview with the Wall Street Journal last month that she wasn’t sure whether Sora was trained on user-generated videos from YouTube, Facebook and Instagram.

The Journal reported this week that OpenAI has discussed training its next-generation large language model, GPT-5, on transcriptions of public YouTube videos, citing people familiar with the matter.

Mohan said Google abides by YouTube’s individual contracts with creators when deciding whether to use videos from the platform in training the company’s own powerful AI model, Gemini.

“Lots of creators have different sorts of licensing contracts in terms of their content on our platform,” Mohan said. Though “some portion of that YouTube corpus may be being used” to train models like Gemini, Google and YouTube ensure that using the videos as training data for Google’s AI is “in concert with whatever the terms of service or the contract that that creator has signed” beforehand, he said. – Bloomberg

