Atlas Glugged.

by Tuesday, 22 October 2013

(aka: You down with LHC? Yeah, you know me!)

# kill [terrible] [puns]

OK, OK…. I’ll cut to the chase on this one.

In these blogs, with customers, to random strangers – my loyal followers all know that I love to discuss how DDN technology is used to make the world better.  The examples are everywhere, many of which I’ve cited, others that I’ve yet to cite. That said, it’s very rare that an IT company gets to say that it is responsible for uncovering the basic building blocks of our physical existence. Today, I’m happy to declare that DDN systems have been associated with a monumental scientific discovery that has shaped the course of human history and our understanding of basic physics.

Two... make that three words:  Nobel freakin’ Prize.

First off, congratulations go to Messrs. François Englert and Peter W. Higgs, jointly awarded the 2013 Nobel Prize in Physics. Mr. Englert and Mr. Higgs independently theorized about what is now called the Higgs particle and the origin of mass of subatomic particles.  The discovery of the Higgs particle was the product of decades of theory and research by several independent research teams, and the Higgs theories were ultimately confirmed only through the construction of one of the largest scientific instruments of all time, the Large Hadron Collider (LHC).

With a total budget of $9B, the LHC stands tall as the highest-energy particle collider in existence.  The collider is 27km in circumference and even crosses borders.  More importantly, this feat of mankind is capable of 600 million particle collisions per second.  One of those happened to be caught in CERN’s large-scale data storage network and analyzed on some very high-powered computing equipment.  If you’re looking for the sonogram of the baby Higgs, look no further than here:

Visualization of ‘the God Particle.’ So cute. Compliments of CERN (creative commons).

600 million collisions per second.  ATLAS is one of four main experiments that the LHC supports, and it has dedicated infrastructure (sensors, processors, etc.) to help it track and understand scientific phenomena, specifically as they relate to the discovery of the Higgs boson. It turns out this is a fairly data-intensive endeavor.

I’m just going to quote Wikipedia on this one, because I can’t write this stuff better myself: “The detector generates unmanageably large amounts of raw data: about 25 megabytes per event (raw; zero suppression reduces this to 1.6 MB), multiplied by 40 million beam crossings per second in the center of the detector. This produces a total of 1 petabyte of raw data per second.”
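For the skeptics, the arithmetic in that quote is easy to check. Here is a minimal Python sketch, assuming decimal units (1 MB = 10^6 bytes, 1 PB = 10^15 bytes). The constant and function names are mine, purely for illustration. The zero-suppressed rate is a hypothetical figure: it assumes every beam crossing is kept at 1.6 MB, which the quote does not claim.

```python
# Back-of-the-envelope check of the quoted data rate.
# Assumption: decimal (SI) units, not binary ones.
MB = 10**6    # bytes per megabyte
TB = 10**12   # bytes per terabyte
PB = 10**15   # bytes per petabyte

RAW_EVENT_BYTES = 25 * MB            # "about 25 megabytes per event"
SUPPRESSED_EVENT_BYTES = 1_600_000   # "zero suppression reduces this to 1.6 MB"
CROSSINGS_PER_SECOND = 40_000_000    # "40 million beam crossings per second"


def bytes_per_second(event_bytes: int, events_per_second: int) -> int:
    """Data rate in bytes/s for a given event size and event rate."""
    return event_bytes * events_per_second


raw_rate = bytes_per_second(RAW_EVENT_BYTES, CROSSINGS_PER_SECOND)

# Hypothetical: what the rate would be if every crossing were stored
# after zero suppression. The quoted text does not state this number.
suppressed_rate = bytes_per_second(SUPPRESSED_EVENT_BYTES, CROSSINGS_PER_SECOND)

print(f"raw rate: {raw_rate // PB} PB/s")
print(f"zero-suppressed rate (hypothetical): {suppressed_rate // TB} TB/s")
```

So 25 MB times 40 million per second really does come out to 1 PB every second, and even the slimmed-down events would add up to tens of terabytes per second if nothing were thrown away.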

Enter DDN. For over half a decade, DDN has been fortunate to be selected for its enterprise-class, highly available systems by LHC partners and high energy physics researchers that require a balance between $/performance and $/capacity. Beyond cost efficiency, these customers also value operational uptime, and they choose highly available platforms for the non-stop processing of fresh data from the main CERN facility.  These 11 Tier-1 and more than 150 Tier-2 LHC partners are actually responsible for the bulk of interpreting data from CERN experiments.  CERN, TRIUMF, SARA, BNL, DESY, IN2P3, KIT, NERSC and many, many other organizations are examples of sites that have selected DDN technology throughout the years.  As CERN completes its LHC upgrade and gets the collider back online, we look forward to onboarding many more customers, especially since the collider will be generating twice the data that it did previously, humbling less capable storage platforms.

The math, for these organizations, is fairly simple when it comes to selecting DDN technology:  Our appliances can support, today, up to 1,680 hard disks in an SFA system.

- We can house over 3PB in a data center rack, minimizing data center sprawl

- We dilute the appliance cost of ownership across over 1500 commodity HDDs

- We support HEP-specific file systems, like dCache, as well as parallel file systems

- Our systems are highly robust, and can ride through component failures, most of the time without even exposing the application environment to performance degradation

Net-net: researchers can focus on research, not IT issues or budgetary challenges that are caused by the explosion of massive data sets.

I’ll let Jos van Wezel, from the Karlsruhe Institute of Technology (KIT), a CERN Tier-1 partner, valued DDN customer and proud owner of over 15PB of DDN storage, tell the story better than I ever could … see below from his talk at our June ’13 User Group.


So – what’s the point?

Well, I’m afraid I don’t have a profound thesis for this post. I simply want to stop and recognize the value of scientific computation and data analysis as the next pillar of the scientific method. I’d also like to congratulate all of the CERN partners who contributed to this wonderful discovery. Finally, I’d like to reflect on the emergence of a new era where 600 million ‘collisions’ easily happen every second.

This type of data creation isn’t exclusive to CERN. From Wall St. to Twitter, from genomics to video surveillance – sensors are everywhere.  Big data storage is a big part of the overall equation, and DDN is driving our innovation forward very rapidly these days to make sure you don’t need to be a Nobel Laureate to figure out how to extract value from data and accelerate insights.