Nvidia Downloaded a ‘Human Lifetime’ Amount of Video Daily to Train Chips
To compile training data for AI chips, Nvidia was downloading 80 years’ worth of video daily from YouTube, Netflix, and academic databases.
Despite weeks of Wall Street cold-waterboarding AI, Nvidia somehow kept dry — until now. A bad day for the stock market was even worse for the chip juggernaut. And it had a bad PR day to boot.
Think the teenage boy in your life watches a lot of YouTube? He’s got nothing on Nvidia, except perhaps the law. To compile training data for its AI chips, Nvidia was downloading 80 years’ worth of video per day from YouTube, Netflix, academic databases, and other sources in what may be serial copyright infringement, according to internal documents seen by 404 Media.
Training Montage
The market meltdown is in part a course correction to human hallucinations about AI’s near-term prospects. None of this explains why it never occurred to Nvidia to ask for permission — or, umm, a going rate — before compiling datasets that included the links to as many as 130 million YouTube videos. In the process, per documents and Slack messages seen by 404 Media, Nvidia employed IP address-hiding tools to circumvent YouTube’s anti-scraping firewalls, while also tapping datasets previously compiled by academics that were designated for research-only purposes.
When staffers raised legal or ethical concerns, they were told they had “umbrella approval” from the highest level of the company to use the content, 404 Media reports. One vice president-level employee even suggested downloading “the whole Netflix too.” Think of the popcorn. No, think of the lawyers:
- “Copyright law protects particular expressions but not facts, ideas, data, or information. Anyone is free to learn facts, ideas, data, or information from another source and use it to make their own expressions. Fair use also protects the ability to use a work for a transformative purpose, such as model training,” an Nvidia spokesperson told 404 Media.
- Meanwhile, a Google spokesperson referred to a previous statement that called such actions a “clear violation” of its terms of use, a sentiment shared by a Netflix spokesperson.
Of course, all of this is coming out as Nvidia suffers more than a scrape from investors. The company’s share price skidded nearly 7% on Monday, helping to fuel a nearly 3% drop in the tech-heavy Nasdaq 100. Nvidia’s share price is still up over 100% year-to-date.
We’re Thriving: Not everyone is an AI pessimist just yet. On Monday, AI chip startup Groq raised a $640 million Series D led by BlackRock at a $2.8 billion valuation, nearly three times its previous valuation. Meanwhile, Joshua Kushner’s Thrive Capital, which has previously invested in OpenAI, raised $5 billion for its largest pair of venture capital funds to date, per a Wall Street Journal report.