The Future Is Here
We may earn a commission from links on this page

Class-Action Lawsuit Says Google Stole Everyone's Data to Train Its AI

A 6-year-old, a best-selling author, and others accuse Google of stealing “everything ever shared on the internet” after Gizmodo noted a privacy policy change.

A man staring at the Google logo on three screens with green text floating by.
Photo: TY Lim / (Shutterstock)

Google got smacked with a class-action lawsuit Tuesday accusing the search giant of “stealing everything ever shared on the internet,” including copyrighted works and millions of people’s personal data. The law firm behind the case, Clarkson, said the suit comes after Google changed its AI privacy policy, an update first spotted by Gizmodo. The company changed its policy to say it reserves the right to scrape all the internet’s public information to fuel its artificial intelligence projects.

“Google does not own the internet, it does not own our creative works, it does not own our expressions of our personhood, pictures of our families and children, or anything else simply because we share it online,” said Ryan Clarkson, managing partner of Clarkson, in a press release. “We have only recently learned that Google has been taking everything ever created or shared online by millions of internet users, including all our personal information, creative works, and professional works, and using all of that data to train and build commercial AI Products.”


The case comes after a nearly identical lawsuit against OpenAI, maker of ChatGPT, which was filed by the same firm. Chatbots like ChatGPT and Google’s Bard, not to mention countless other AI endeavors, are trained on mountains of public information scraped off the internet. Companies like Google feed the data into their AI systems, and the AI produces “new” content based on what it learns. That’s opened a brand new question for courts: when a human studies existing work and gets ideas for new content, that’s perfectly legal. Is it different when a private company plugs information into a database to do the same thing with AI?

So far, it’s up for debate, but the complaint says Google broke copyright law and collected people’s personal information without consent. The plaintiffs, known only by their initials in the lawsuit, include a New York Times best-selling author, a six-year-old boy, a software developer, a TikTok influencer, an actor, and several others.


“We’ve been clear for years that we use data from public sources – like information published to the open web and public datasets – to train the AI models behind services like Google Translate, responsibly and in line with our AI Principles,” said Halimah DeLaine Prado, Google’s General Counsel, in response to the complaint. “American law supports using public information to create new beneficial uses, and we look forward to refuting these baseless claims.”

In Google’s defense, the company’s AI data harvesting wasn’t exactly a secret. Google’s privacy policy used to read that it uses publicly available information to help train “language models” such as Google Translate. The general public, however, may not read that and understand “language models” are a kind of AI. On July 1st, Google updated the policy to spell out its practices and give other examples:

“Google uses information to improve our services and to develop new products, features and technologies that benefit our users and the public. For example, we use publicly available information to help train Google’s AI models and build products and features like Google Translate, Bard, and Cloud AI capabilities.”

However, the lawsuit argues Google harvested the data covertly. “Google harvested this data in secret for years, without providing notice to anyone, much less with anyone’s consent,” Clarkson said.

The suit claims $5 million in damages. It also asks the court for a temporary freeze on commercial use of the technology until guardrails are in place, and for “data dividends” to be paid to every person whose information was used to develop Google’s AI.