“Google does not own the internet, it does not own our creative works, it does not own our expressions of our personhood, pictures of our families and children, or anything else simply because we share it online,” said Ryan Clarkson, managing partner of Clarkson, in a press release. “We have only recently learned that Google has been taking everything ever created or shared online by millions of internet users, including all our personal information, creative works, and professional works, and using all of that data to train and build commercial AI Products.”
The case follows a near-identical lawsuit against OpenAI, maker of ChatGPT, filed by the same firm. Chatbots like ChatGPT and Google’s Bard, not to mention countless other AI endeavors, are trained on mountains of public information scraped off the internet. Companies like Google feed the data into their AI systems, and the AI produces “new” content based on what it learns. That’s opened a brand-new question for courts. When a human studies existing work and gets ideas for new content, that’s perfectly legal. Is it different when a private company plugs information into a database to do the same thing with AI?
So far, it’s up for debate, but the complaint says Google broke copyright law and collected people’s personal information without consent. The plaintiffs, known only by their initials in the lawsuit, include a New York Times best-selling author, a six-year-old boy, a software developer, a TikTok influencer, an actor, and several others.
“We’ve been clear for years that we use data from public sources, like information published to the open web and public datasets, to train the AI models behind services like Google Translate, responsibly and in line with our AI Principles,” said Halimah DeLaine Prado, Google’s General Counsel, in response to the complaint. “American law supports using public information to create new beneficial uses, and we look forward to refuting these baseless claims.”
Google’s privacy policy also states: “Google uses information to improve our services and to develop new products, features and technologies that benefit our users and the public. For example, we use publicly available information to help train Google’s AI models and build products and features like Google Translate, Bard, and Cloud AI capabilities.”
However, the lawsuit argues Google harvested the data covertly. “Google harvested this data in secret for years, without providing notice to anyone, much less with anyone’s consent,” Clarkson said.
The suit seeks $5 million in damages, asks the court to temporarily freeze commercial use of the technology until guardrails are in place, and requests that “data dividends” be paid to every person whose information was used to develop Google’s AI.