DDN is really sizzling when it comes to weather these days – at the UK Met Office, NCAR, NOAA, TRIUMF, and many more – because the complexity of weather and climate workflows is rising, as are the amount of data computed per run, the number of processes, and the number of researchers concurrently accessing shared resources. The result is huge data sets that grow rapidly, present a very mixed I/O profile when they reach storage, and need to be accessed very fast. In other words, exactly where DDN is strongest.

Instrument ingest and ensemble modeling get most of the coverage in weather and climate, and this is where DDN technology is primarily used. However, one of the fastest-growing needs in climate and forecasting infrastructure is moving older data from the archive tier into the production environment quickly. Put simply, the more historical data a model can include in its analysis, the greater its predictive power.

The Met Office is the United Kingdom’s national weather service, where more than 650 scientists conduct cutting-edge research to help shape global climate change policies while investigating the impact a changing climate may have on energy, health, flooding, farming, water, and food production worldwide. In the process, the Met Office generates about 89TB of archive data per day, a figure expected to rise to 200TB next year. By 2020, the storage archive itself is expected to reach about 300PB (roughly 150x all of the data contained in all U.S. academic research libraries [1], or 150 trillion pages of text [2]) as scientists and researchers store more and more data for later analysis.

The archive enables UK Met to store massive amounts of data at considerably lower cost than keeping everything on fast, tier-1 storage. However, archive storage delivers considerably lower performance than tier-1 storage, so a caching layer is needed to stage data into the production environment so that researchers and supercomputers aren’t left waiting for it. The same caching layer also stages new data sets and results back out, feeding them into the archive at a rate the archive can absorb.
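The stage-in/stage-out pattern such a caching layer implements can be sketched in very simplified form. The sketch below is purely illustrative – the class and method names are hypothetical and have nothing to do with UK Met’s or DDN’s actual software – and it models the fast tier as an LRU cache in front of a slow archive, with new writes landing in the fast tier before draining to the archive:

```python
from collections import OrderedDict

class StagingCache:
    """Illustrative sketch of an archive staging cache: reads are served
    from a fast tier, faulting in from the slow archive on a miss, with
    least-recently-used eviction when the fast tier is full."""

    def __init__(self, archive, capacity):
        self.archive = archive      # slow tier: any dict-like {name: data}
        self.capacity = capacity    # max objects held in the fast tier
        self.fast = OrderedDict()   # fast tier, ordered by recency

    def read(self, name):
        if name in self.fast:
            self.fast.move_to_end(name)   # cache hit: refresh recency
            return self.fast[name]
        data = self.archive[name]         # cache miss: slow stage-in
        self._stage(name, data)
        return data

    def write(self, name, data):
        # New results land in the fast tier first, then drain to the
        # archive at a rate it can absorb (here: immediately, for brevity).
        self._stage(name, data)
        self.archive[name] = data

    def _stage(self, name, data):
        self.fast[name] = data
        self.fast.move_to_end(name)
        while len(self.fast) > self.capacity:
            self.fast.popitem(last=False)  # evict least recently used
```

In a real deployment the stage-out to the archive would of course be asynchronous and throttled; the point of the sketch is only the two-way staging role the cache plays between the archive and the production tier.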

Recently, UK Met considerably upgraded their archive caching tier by adding several systems from DDN, including an SFA12KX™, an SFA7700X flash storage appliance, and a GS12K® easy-to-deploy, serverless parallel storage appliance. They were very happy with the performance, ease of scale, and reliability of the solution, which led them to select DDN for a new project: a powerful file storage solution to support a scalable compute and storage environment for post-HPC model data processing. Called the Scientific Processing and Intensive Compute Environment (SPICE), this self-contained platform lets scientists move data from the HPC system as well as the MASS archive into a separate system to speed post-processing analysis.

The level of science performed, the number of researchers involved, the sheer size of the data, and the critical services provided affect the lives of millions of people. The infrastructure may be the least glamorous element of the process, but without fast, reliable access to massive data, none of the rest would be possible. So, whether you have 200TB or 200PB, if you need fast, reliable, cost-effective access that just scales, check out ddn.com.

  • Laura Shepard
  • Senior Director of Marketing
  • Date: December 13, 2016