
The Library of Congress Can’t Quite Handle That Massive Tweet Archive It Was Trying to Build


A few years ago, the Library of Congress announced its plans to create an archive of every public tweet ever. If you thought that sounded a little bit optimistic, you’d be right; the Library of Congress released a white paper today explaining why they can’t quite pull it off.

https://gizmodo.com/your-past-and-future-tweets-will-be-archived-at-the-lib-5517180

It’s not all bad. The Library of Congress has an archive, and you can search it. The situation just isn’t optimal, so they’ve had to turn away over 400 researchers who’ve requested access. Due to the Library’s agreement with Twitter, the archive could never have been fully public anyway, but the collection as-is chugs under even the lightest stress. A single query can take about 24 hours, and that’s only for the tweets from 2006-2010.

In order to provide a search that’s a little more useful, the Library of Congress says it’d need a lot more resources.

“To achieve a significant reduction of search time, however, would require an extensive infrastructure of hundreds if not thousands of servers. This is cost prohibitive and impractical for a public institution.”

The upside to all this is that the Library of Congress seems to be succeeding in its most basic goal of actually archiving tweets. But that’s of questionable worth if no one can really find what they’re looking for. Fortunately, you can archive your own tweets now, but if you’re looking for a searchable database of the whole tweetverse, it seems you’ll have to keep on waiting. [The Library of Congress via Buzzfeed]

https://gizmodo.com/you-can-now-download-your-entire-twitter-history-maybe-5968814

Image by Biehler Michael/Shutterstock
