If you try to think about how big the Internet is, and how much data it contains, the results are mind-boggling. That hasn't stopped the Internet Archive from trying to collect it all, though, and now they've hit a big milestone: 10 petabytes. That's 10,000 terabytes, or 10,000,000,000,000,000 bytes. It's a bit.
What are they going to do with all that data, all 10 million gigabytes of it? Well, in addition to holding onto it for posterity, they give it back through everyone's favorite time travel device, the Wayback Machine. They're also doing things like offering researchers an experimental 80-terabyte crawl from 2011, to see if anyone can do anything cool with it.
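If you want to check the math on those numbers, here is a rough sketch in Python. It assumes decimal (SI) units, where a petabyte is 1,000 terabytes, since that is what the figures above imply. The `UNITS` table and the `convert` helper are made up for illustration and are not anything the Archive publishes.

```python
# Decimal (SI) storage units, assumed because the article equates
# 10 petabytes with 10,000 terabytes.
UNITS = {"B": 1, "GB": 10**9, "TB": 10**12, "PB": 10**15}


def convert(amount, src, dst):
    """Convert an amount from one storage unit to another (hypothetical helper)."""
    return amount * UNITS[src] / UNITS[dst]


archive_pb = 10  # the Archive's milestone, in petabytes
crawl_tb = 80    # the experimental 2011 crawl, in terabytes

print(f"{convert(archive_pb, 'PB', 'TB'):,.0f} terabytes")
print(f"{convert(archive_pb, 'PB', 'GB'):,.0f} gigabytes")
print(f"{convert(archive_pb, 'PB', 'B'):,.0f} bytes")

# Share of the whole collection that the research crawl represents.
share = convert(crawl_tb, "TB", "PB") / archive_pb
print(f"The 80 TB crawl is about {share:.1%} of the collection")
```

By that arithmetic, the 80 terabyte research crawl is a little under 1 percent of the whole pile.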
10 petabytes seems like a huge milestone, and it is, but we're generating more and more content with each passing day, so 10 petabytes will seem like nothing 10 years down the line. Even now, companies like Facebook and Google probably have collections of data that rival this, but the Archive's data is at least somewhat curated, not just a pile of crap. Hopefully they can keep pace as the Internet continues to explode. Kind of makes your terabyte porn stash look insignificant, doesn't it? [The Internet Archive via Reddit]