
Yesterday, the European Organization for Nuclear Research (CERN) dropped a staggering amount of raw data from the Large Hadron Collider on the internet for anyone to use: 300 terabytes worth.


The data includes 100 TB “of data from proton collisions at 7 TeV, making up half the data collected at the LHC by the CMS detector in 2011.” The release follows another infodump from 2014, and you can take a look at all of this information through the CERN Open Data Portal. Some of the information released is simply the raw data that CERN’s own scientists have been using, while another segment is already processed and intended for high school science courses.

CERN has a practical reason for releasing its raw data. Once its own scientists are done with it, the organization feels that the general public can get just as much use and knowledge out of it:


“Members of the CMS Collaboration put in lots of effort and thousands of person-hours each of service work in order to operate the CMS detector and collect these research data for our analysis,” explains Kati Lassila-Perini, a CMS physicist who leads these data-preservation efforts. “However, once we’ve exhausted our exploration of the data, we see no reason not to make them available publicly. The benefits are numerous, from inspiring high-school students to the training of the particle physicists of tomorrow. And personally, as CMS’s data-preservation co-ordinator, this is a crucial part of ensuring the long-term availability of our research data.”

Opening up the information has some practical scientific uses as well. CERN notes that it has already seen outside scientists confirm its results using the same information, and others take the research in directions its own teams didn’t initially anticipate.

More than that, the release takes the workings of this incredible machine and puts them into the hands of the public to look at for themselves. While you probably can’t take this information and discover a new particle without some advanced education, it’s information that anyone can play around with, breaking down the barrier between scientists working in a lab and the general public.


[CERN via Slashdot]