YouTube Creators Unknowingly Fuel Google’s AI Models

Image by Szabo Viktor, from Unsplash

Google has confirmed that it uses a subset of YouTube videos to train its artificial intelligence models, including Gemini and the advanced Veo 3 video generator.

In a rush? Here are the quick facts:

  • Creators were not informed that their videos train AI tools.
  • YouTube terms allow Google to license uploaded content globally and royalty-free.
  • Experts warn AI could compete with creators without consent or compensation.

The news, first reported by CNBC, has sparked criticism from content creators and intellectual property specialists, who worry about their content being used to develop tools that could eventually replace them.

“We’ve always used YouTube content to make our products better, and this hasn’t changed with the advent of AI,” a YouTube spokesperson said to CNBC.

“We also recognize the need for guardrails, which is why we’ve invested in robust protections that allow creators to protect their image and likeness in the AI era,” the spokesperson added.

CNBC reports that YouTube hosts over 20 billion videos. Google, however, has not revealed the specific number of videos it uses for AI training. The article notes that even a 1% selection from YouTube’s vast catalog would still result in billions of minutes of content, which exceeds the training data of most competing AI platforms.

CNBC spoke with several creators and intellectual property professionals who were unaware their content might be used to train AI. “It’s plausible that they’re taking data from a lot of creators that have spent a lot of time and energy and their own thought to put into these videos,” said Luke Arrigoni, CEO of digital identity company Loti. “That’s not necessarily fair to them,” he added.

Google showcased Veo 3 in May through AI-generated cinematic content. Though the company has the legal right under YouTube’s terms of service to use uploaded content, experts like Dan Neely of Vermillio warn that AI-generated tools could directly compete with the creators providing training data.

“We’ve seen a growing number of creators discover fake versions of themselves,” Neely said to CNBC.

Further fueling the debate, an investigation revealed that several major AI firms, including Apple, Nvidia, Anthropic, and Salesforce, have used transcripts from over 173,000 YouTube videos to train AI models, despite platform policies.

These videos came from more than 48,000 channels, including top creators like MrBeast, PewDiePie, and Marques Brownlee, as well as academic and news institutions such as MIT, Khan Academy, NPR, and the BBC.

The lack of a clear opt-out option, or any warning when AI is scraping content, has prompted creators to demand greater transparency and stronger protections around AI training.
