Scribd has become a popular place for people to upload PDF documents to share with the world. But the hosting site’s algorithms aren’t handling the release of the Mueller Report very well, according to a new report from Quartz. Scribd’s robots are flagging the document as a copyrighted work, despite the fact that, as a government-produced document, it’s in the public domain.
The Mueller Report, which documents the Trump regime’s many contacts with Russian government associates in the lead-up to the 2016 presidential election as well as Trump’s efforts to impede the investigation into him, was released for free by the Justice Department, but some publishers are selling it as a book.
It’s not clear why Scribd’s algorithms are flagging the document, and the San Francisco-based company did not immediately respond to Gizmodo’s request for comment on Friday afternoon. But Scribd’s official Twitter account was sharing the report yesterday, and lots of people uploaded the document after it hit the U.S. Justice Department’s website at precisely 11:00 am ET.
The Justice Department’s servers were apparently overloaded by the deluge of interest, as many people reported not being able to download the document after it was first released. To make matters even more complicated, the DOJ’s version of the document wasn’t searchable, so it became even more important that the historic 448-page report be uploaded in multiple places in a format that had been converted using OCR software.
Quartz reports that its own version of the document was flagged by “Scribd’s BookID copyright protection system,” which “disabled access” to its upload.
The automated email that Quartz received acknowledged that Scribd’s copyright robocops “will occasionally identify legitimate content as a possible infringement.” Quartz reports that its upload of the Mueller Report was reinstated quickly after the news site protested.
Other internet platforms like YouTube have similar measures in place to flag materials that might violate copyright. But it’s not always the robots that are taking down perfectly acceptable material. Speaking from personal experience, copyright trolls regularly claim ownership of public domain, government-created videos that I upload to my personal YouTube account. And there’s basically nothing I can do aside from contest the copyright claim and spend time providing evidence that the video is in the public domain.
Scribd is great, but there are other places to upload documents where they won’t be taken down by automated systems. I’m a big believer in the Internet Archive, which introduced embed features in recent years. Easy embedding was one of the reasons that people loved Scribd when it first launched in 2007, but you don’t need Scribd for that anymore. And there are plenty of easy-to-find copies of the Mueller Report over at Archive.org.
Update: Scribd sent Gizmodo a message via email that acknowledges the Mueller Report was uploaded by a major publisher, which is why its automated system believed that it was a copyrighted work. The publisher is unnamed, but outlets like the Washington Post are selling copies of the report that contain analysis by their own writers.
You can read Scribd’s full statement below.
Scribd’s subscribers have access to the largest library of digital content, which includes millions of books, audiobooks, magazine and newspaper articles from the world’s leading publishers as well as original and public domain content uploaded by community of over 100 million users from 194 countries.
We believe in everyone’s right to share their stories, and our open publishing platform encourages the exchange of ideas through user-contributed documents that range from study guides and dissertations to court cases and official government memos.
We take copyright infringement seriously and have a team dedicated to addressing any concerns. We also have policies and technologies that help us to provide protection, and we encourage our subscribers to help us by reporting any abuses.
As an additional layer of protection, Scribd developed an automated copyright protection system, called BookID, to combat copyright infringement on the platform. The system uses algorithms to compare reference samples of copyrighted works provided by publishers with materials uploaded by users. It is widely used by authors and publishers around the world to identify unauthorized use of their works online.
Yesterday upon its release, several individuals and organizations shared The Mueller Report with their communities using Scribd’s open publishing platform. Later in the afternoon, a leading global publisher released the Report in book format, and our BookID system misidentified the content as copyright protected. As such, user uploaded copies were removed from our system.
Upon learning of the removal, the Scribd team immediately reinstated all user-uploaded versions of The Mueller Report, and it has been correctly earmarked as public domain so that automatic removals through BookID will cease.
We are thankful to our community members who alerted us to the system’s error, and we look forward to continuing to serve our communities of readers who use Scribd to share ideas with the world and publishers and authors who trust us to protect their copyrighted content.
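For the curious, the kind of matching Scribd describes in its statement, comparing publisher-supplied reference samples against user uploads, can be sketched in a few lines of Python. To be clear, this is not BookID, whose internals Scribd hasn’t published. It’s a rough illustration that assumes a simple word-window ("shingle") overlap check, and every name in it (`shingles`, `overlap`, `is_flagged`, the 0.8 threshold, the sample texts) is made up for the example. It shows how a public domain report can get flagged once a publisher’s book containing the same text becomes a reference sample, and how a public domain allowlist stops the false positive.

```python
import re


def shingles(text, k=5):
    """Return the set of k-word windows (shingles) found in text."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    return {" ".join(words[i:i + k]) for i in range(len(words) - k + 1)}


def overlap(upload, reference, k=5):
    """Fraction of the upload's shingles that also appear in the reference."""
    upload_shingles = shingles(upload, k)
    if not upload_shingles:
        return 0.0
    return len(upload_shingles & shingles(reference, k)) / len(upload_shingles)


def is_flagged(upload, references, public_domain=(), threshold=0.8, k=5):
    """Flag an upload that closely matches a publisher reference sample,
    unless it also matches a text marked as public domain (hypothetical
    allowlist, checked first)."""
    if any(overlap(upload, text, k) >= threshold for text in public_domain):
        return False
    return any(overlap(upload, ref, k) >= threshold for ref in references)


# Illustrative stand-ins, not real excerpts.
report = ("the special counsel investigated russian interference "
          "in the 2016 presidential election")
publisher_book = "analysis by our reporters follows " + report
study_guide = ("study guide for an introductory chemistry course "
               "on organic compounds")

# The user's upload of the report matches the publisher's book edition.
flagged_before = is_flagged(report, [publisher_book])
# Once the report is marked public domain, the same upload passes.
flagged_after = is_flagged(report, [publisher_book], public_domain=[report])
# An unrelated upload is never flagged.
flagged_other = is_flagged(study_guide, [publisher_book])
```

In this toy version, the upload is flagged because every one of its word windows also appears in the publisher’s edition, even though the publisher added its own material up front. Checking the allowlist before the reference samples is one plausible way to get the behavior Scribd describes after the fix, where the report is marked public domain and automatic removals stop.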