More data than you can think of at CERN

Today, CERN's computer scientists and engineers are pioneering grid computing to crunch the gigantic volumes of data produced daily by what CERN's director, Rolf-Dieter Heuer, calls "the largest microscope on earth": the Large Hadron Collider (LHC).

In this gigantic particle accelerator, which occupies a 27 km circular tunnel 100 metres beneath the Franco-Swiss border, protons and lead ions began colliding in 2010 to recreate the conditions of the first nanoseconds after the Big Bang and unravel the secrets of matter. Each of these collisions produces particle shrapnel, which detectors capture in pictures showing the particles in different states and positions.

But the numbers are staggering. "Just one experiment, such as CMS or ATLAS, produces data from 100 million sensors per detection, while detections happen every 25 nanoseconds for days on end", explains Frédéric Hemmer, director of CERN's IT Department. CERN scientists have predicted that when the LHC is fully operational, it will produce 15 petabytes a year, or nearly 1% of all the digital information produced annually on the planet. This is equivalent to a stack of CDs more than 20 km high. Data were already recorded during the 2010 data taking at experiment sites such as ALICE (pictured).
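The figures quoted above can be checked with a little arithmetic. The sketch below is only a rough estimate: the CD capacity (700 MB), the disc thickness (1.2 mm) and the use of decimal units (1 petabyte = 10^15 bytes) are assumptions that do not come from the article, and the function names are purely illustrative.

```python
# Rough check of the article's figures.
# Assumed values, NOT from the article: a 700 MB CD that is 1.2 mm thick,
# and decimal units (1 PB = 10**15 bytes).
PETABYTE = 10**15
CD_CAPACITY_BYTES = 700 * 10**6
CD_THICKNESS_MM = 1.2


def cd_stack_height_km(petabytes):
    """Height in km of the stack of CDs needed to hold the given data volume."""
    cds = petabytes * PETABYTE / CD_CAPACITY_BYTES
    return cds * CD_THICKNESS_MM / 1_000_000  # millimetres to kilometres


def detections_per_second(interval_ns):
    """Number of detections per second when one happens every interval_ns."""
    return 1_000_000_000 // interval_ns


stack_km = cd_stack_height_km(15)   # 15 petabytes a year
rate = detections_per_second(25)    # one detection every 25 nanoseconds

print(f"CD stack height: {stack_km:.1f} km")
print(f"Detections per second: {rate:,}")
```

Under these assumptions the stack comes out at roughly 25.7 km, consistent with "more than 20 km", and a detection every 25 nanoseconds corresponds to 40 million detections per second.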

CERN cannot handle such a quantity of information alone. Its resources are therefore pooled in a grid of computers with a total capacity of 250,000 processor cores, distributed among 151 computing centres in 34 countries. This complex enterprise has already produced results that go beyond physics research. For example, grid software developed by CERN and its partners is used to simulate the efficacy of chemical compounds against diseases in order to screen for new drug candidates.

