Good artists borrow, great artists steal, as the adage goes. In the cutthroat and almost completely unregulated modern AI industry, there are many tech developers who would probably agree. Few of them would come right out and admit it, though.
Not long after the generative AI boom took the business world and Wall Street by storm, a chorus of complaints rose against companies like OpenAI, Microsoft, Anthropic, and Google, whose models were trained using a sizable chunk of all the content that’s ever been published on the internet, including, as many subsequent lawsuits would allege, massive quantities of copyrighted materials. Confronted by a potentially existential threat to the business model that had sustained them for decades, some major publishers chose to fight back in court. (Others signed content licensing agreements with leading AI labs, trading access to their databases in exchange for a cut of the labs’ profits, custom AI tools, and other perks.) By and large, the AI companies have responded to these allegations by arguing that the scraping of online data is permissible under existing laws around “fair use.”
Given the financial stakes and the novelty of the technology in question, lawyers and judges will have their hands full for some time before such disputes are finally resolved. In the meantime, legal challenges against AI companies are continuing to mount.
On Wednesday, a group of publishers who collectively own close to 400 local and regional newspapers across the country sued OpenAI and Microsoft for what they allege was the “systematic and willful theft of hundreds of thousands of articles” scraped from the internet to train ChatGPT and Copilot. “Those products have generated hundreds of billions of dollars (and counting) in market value for [the] Defendants,” the lawsuit, filed in the U.S. District Court for the Southern District of New York, read. “Not a cent of it has gone to the Publishers whose work made it possible.”
But media companies and artists aren’t the only ones accusing AI companies of stealing their work. Increasingly, accusations are being lobbed between companies themselves—along a distinctly West-East axis.
Also on Wednesday, multiple media outlets reported that Anthropic—currently embroiled in a fresh dispute with the Trump administration over foreign access to its newest models—sent a letter to federal officials accusing the Chinese tech firm Alibaba of “illicitly” using Claude to train a new AI model.
Between late April and early June, according to Anthropic, Alibaba used nearly 25,000 fraudulent Claude accounts to conduct tens of millions of exchanges with the chatbot; those exchanges were then used as raw training data for Alibaba’s AI system, a process known in the industry as adversarial distillation. (“Adversarial” in this context doesn’t have any geopolitical connotations, but rather refers to the technical method used to train a new AI model via its interactions with an existing model.)
Anthropic has previously accused Chinese AI startups DeepSeek, Moonshot, and MiniMax of the same thing. (OpenAI has also accused DeepSeek of illicit distillation of its models.) Then, as now, the company didn’t accuse its Chinese competitors of anything that’s definitively illegal; the claim is that this kind of large-scale distillation effort violates the company’s terms of service and warrants a coordinated response across the American public and private sectors to prevent Chinese companies from gaining a lead in the much-fretted-over AI race.
And then, as now, Anthropic wasn’t in particularly good graces with the very government it was trying to appeal to.
In its new letter about Alibaba, which was addressed to Senators Tim Scott and Elizabeth Warren, the chair and ranking member, respectively, of the Senate Committee on Banking, Housing, and Urban Affairs, the company reportedly said it would assist the government in its efforts to prevent these kinds of attacks from happening in the future. In April, White House Office of Science and Technology Policy director Michael Kratsios published a memo stating that the Trump administration would take several steps, including partnering with private companies, to fight what it described as “industrial-scale campaigns to distill U.S. frontier AI systems.”
Kratsios’ memo drew a distinction between that kind of mass distillation (calling out China specifically) and the smaller-scale distillation that AI labs routinely use to train smaller AI systems using larger, more capable models; not all distillation is illicit, in other words.
But even this standard form of distillation comes with risks. For example, a “student” model trained via interactions with a “teacher” model is likely to inherit some dangerous biases that might be hidden in the training data. Microsoft is therefore hoping to boost the appeal of its new MAI-Thinking-1 model, which was trained “with absolutely zero distillation,” Mustafa Suleyman, head of the company’s AI division, said during the opening keynote at the 2026 Microsoft Build conference earlier this month.
Like publishers’ legal disputes with AI developers, the U.S. AI industry’s efforts to prevent foreign companies from “illicitly” using its models to train new ones will almost certainly not be resolved quickly or easily. But one has to suspect that right now, across the country, editors at small-town newspapers are watching American tech companies complain about what they claim amounts to theft, and feeling that at last, a tiny bit of justice has been served.