DDN Blog | Think Big: Insight & Perspectives on Leveraging Big Data

With work ranging from treating cancer and enhancing lives through personalized medicine to enabling sustainable food production, it is more important than ever for life science researchers to choose the right technology provider. As a scientist, you want technology partners who understand your organization’s demands and let you focus on research problems while they handle the exploding infrastructure requirements that distract you from your primary objective.

[Image: Data-Driven Research]

At most life science organizations, data-driven research is grouped into three major categories: (a) genome sequencing, (b) modeling/simulation, and (c) instrumented data analysis. Whether you are analyzing genomic data, developing pharmaceutical or agricultural products, or performing life-saving clinical research, you are likely engaged in one of these three areas.

Genomics involves sequencing DNA or RNA and analyzing the resulting bioinformatics data. Modeling and simulation includes computational chemistry, molecular dynamics, and computer-aided drug design. In instrumented data analysis, some clients combine genotypic and phenotypic data from different instruments and sources to test new hypotheses. In addition, sensor equipment with higher resolutions and frame rates now generates massive amounts of data in a short time. Regardless of the process, analytics resources are struggling to keep up with these data requirements.

Shared Life Science Technology Vision

DDN and IBM have a joint history of more than a decade of delivering high performance computing (HPC) solutions, and both companies share a vision for HPC in the most demanding environments. The combination of the IBM Spectrum Scale™ parallel file system and DDN’s GRIDScaler® family of high performance servers supports production systems with more than 30 PB of data in a single namespace, clusters of more than 10,000 nodes, and throughput greater than 400 GB/s in a single file system. Together, the two companies can help life science researchers working on groundbreaking projects deliver actionable results in far less time and make a large and lasting impact on society.

[Image: Spectrum Scale in Life Sciences]

Whether it’s pure research or handling clinical and patient data with maximum security, collaboration, and encryption, GRIDScaler and IBM Spectrum Scale solutions help life science researchers focus on science while DDN and IBM create infrastructure that scales.

Top Life Science Challenges

IBM and DDN see the following challenges. It has become increasingly difficult to process and store growing volumes of complex data and manage the accompanying infrastructure. From next generation sequencing, biomedical imaging, and electronic medical records to curating scientific literature and analyzing instrumentation data, we see clients struggling with a very large volume, variety, and velocity of data.

Challenge #1: Large-Extreme Data

In fact, some organizations report their scientists are spending up to 80% of their time manually integrating data silos and less than 20% deriving insights from their analysis. Finally, collaboration is critical. Many organizations now require data sharing across international boundaries. In one client’s case, governmental regulations mandate that companies must use highly secure and encrypted infrastructure when sharing data between companies or within divisions of the same company.

Researchers Focus on Science, Not Infrastructure

Scientists care about application performance, not infrastructure. For example, at the Max Delbrück Center for Molecular Medicine in Berlin, researchers observed a 7x performance increase in variant calling with the Genome Analysis Toolkit (GATK) when running on a DDN SFA12KE® with IBM Spectrum Scale, compared to an NFS server. The difference was dramatic: the job completed in 5 hours instead of 36, allowing researchers to speed their analysis and accelerate their research accordingly.
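To make the arithmetic behind that comparison explicit, here is a minimal sketch that derives the speedup factor and the hours saved per job from the two run times quoted above. The function and variable names are illustrative assumptions, not part of any DDN or IBM tooling, and the only inputs are the 36-hour and 5-hour figures from the text.

```python
def speedup(baseline_hours: float, improved_hours: float) -> float:
    """Return how many times faster the improved run is than the baseline."""
    if baseline_hours <= 0 or improved_hours <= 0:
        raise ValueError("run times must be positive")
    return baseline_hours / improved_hours


# Run times quoted in the text: NFS server vs. SFA12KE with Spectrum Scale.
nfs_hours = 36.0
parallel_fs_hours = 5.0

factor = speedup(nfs_hours, parallel_fs_hours)
hours_saved = nfs_hours - parallel_fs_hours

print(f"speedup: {factor:.1f}x, hours saved per job: {hours_saved:.0f}")
# prints "speedup: 7.2x, hours saved per job: 31"
```

The ratio works out to 7.2, which matches the roughly 7x improvement reported for the variant calling job.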

Similarly, scientists at the Wellcome Trust Centre for Human Genetics at Oxford University found that DDN GRIDScaler and IBM Spectrum Scale let the research team focus on their critical work instead of supporting open source solutions. IBM believes that clients need to fully appreciate the hidden costs of open source alternatives, including the burden of providing support under critical discovery timelines. According to the Wellcome Trust, because both DDN GRIDScaler and Spectrum Scale are well supported and documented, the team could spend its time on research rather than on maintaining IT storage infrastructure.

IBM and DDN at Bio-IT World

Bio-IT World is a key event for IBM Storage and other IBM divisions. At this year’s event, IBM is sponsoring the DDN Best Practices for Big Data in Life Sciences Workshop on May 23 on behalf of IBM Spectrum Scale. IBM Cloud Object Storage and Aspera, an IBM Company, will also be present, as will IBM Spectrum LSF, a premier cluster workflow and virtualization solution that helps life science clients gain the highest utilization from their computing infrastructure.

Please visit DDN Booth #357 at Bio-IT World, where our representatives will describe how to make the best use of your time and accelerate your time to insight.

  • Peter Basmajian