The New York Times is currently suing OpenAI for copyright infringement and claims that the influential tech startup used its journalistic material to train its chatbot, ChatGPT, without paying the proper licensing fees. But Sam Altman’s company is fighting back against these accusations with some accusations of its own. This week, OpenAI claimed that the newspaper had “hacked” its products.
In a legal filing made public this week, OpenAI claimed that its products had been abused by “someone” who was paid by the New York Times to do so. The company wrote:
“The allegations in the Times’s complaint do not meet its famously rigorous journalistic standards. The truth, which will come out in the course of this case, is that the Times paid someone to hack OpenAI’s products. It took them tens of thousands of attempts to generate the highly anomalous results that make up Exhibit J to the Complaint. They were able to do so only by targeting and exploiting a bug (which OpenAI has committed to addressing) by using deceptive prompts that blatantly violate OpenAI’s terms of use. And even then, they had to feed the tool portions of the very articles they sought to elicit verbatim passages of, virtually all of which already appear on multiple public websites.”
It’s not entirely clear what OpenAI is talking about. If I had to guess, it sounds like the New York Times hired a contractor to see whether they could make ChatGPT reproduce their reporting. That said, it’s not clear that’s the case. Gizmodo reached out to OpenAI for clarification and will update this story when we receive a response.
When reached for comment, the New York Times provided a statement through Ian Crosby, a Susman Godfrey partner and lead counsel for the paper. The statement reads, in part:
“Building new products is no excuse for violating copyright law, and that’s exactly what OpenAI has done on an unprecedented scale.
“In this filing, OpenAI doesn’t dispute – nor can they – that they copied millions of The Times’s works to build and power its commercial products without our permission.
“What OpenAI bizarrely mischaracterizes as ‘hacking’ is simply using OpenAI’s products to look for evidence that they stole and reproduced The Times’s copyrighted works. And that is exactly what we found. In fact, the scale of OpenAI’s copying is much larger than the 100-plus examples set forth in the complaint.
“It should be no surprise to OpenAI that illegal copying and misinformation are core features of their products and not the result of fringe behavior.
“OpenAI, which has been secretive and has deliberately concealed how its products operate, is now asserting it’s too late to bring a claim for infringement or hold them accountable. We disagree. It’s noteworthy that OpenAI doesn’t dispute that it copied Times works without permission within the statute of limitations to train its more recent and current models.”
OpenAI has built its business on scraping huge swaths of the internet. Artists, authors, journalists, and filmmakers have all had their work hoovered up by the company’s web scrapers; that work has then been used to train the company’s high-octane, content-generating algorithms. Many creatives have decided to sue the company.
That said, many of those lawsuits have floundered so far. The Times’s lawsuit has been deemed one of the most promising legal attacks on the AI industry’s business model, which some critics have referred to as “theft.” OpenAI has repeatedly attempted to get the newspaper’s lawsuit thrown out of court.