The Translational Genomics Research Institute
TGen: Deploying Extreme Storage from DDN to unravel the genetic components of disease
Founded in 2002, The Translational Genomics Research Institute (TGen) is on the cutting edge of translational genome research. By combining revolutionary technology with basic science, TGen aims to identify the genes that play a role in disease development and evolution.
Translational genomics research leverages high-throughput gene sequencing technologies to understand genomic variation and to translate that understanding into the diagnosis and treatment of disease in a manner tailored to individual patients.
High-throughput sequencers parallelize the gene sequencing process and produce millions of sequences at once. As these machines become more powerful, individual gene sequencing and analysis enable affordable patient-specific diagnosis and disease treatment. Additionally, next-generation gene sequencing technologies, such as the SOLiD System from Life Technology - ABI, are capable of delivering as much as 10 times as many sequence reads as Sanger sequencing systems. This new level of resolution has resulted in a sea change in the processing and storage methodologies used to serve and store next-generation sequence data.
"Before we received the DDN ExaScaler S2A9900 system, TGen was not able to support this level of next-generation sequence alignment - our existing systems could simply not deliver enough performance."
- James Lowey, TGen Director of HPC
To increase TGen's research capabilities and revolutionize the translational process within TGen, a multi-phase approach was taken to upgrade the scientific data processing of the genomics workflow:
- First, TGen set out to upgrade their computation environment to accelerate their sequence alignment processes. In partnership with ASU and Dell Computer, TGen deployed a high-performance computing cluster consisting of 4,680 CPU cores and high-speed InfiniBand networking.
- Second, in anticipation of the data processing workload of aligning next-generation sequence data, TGen needed to upgrade the cluster's file storage capabilities. Having previously used generic SAN storage from LSI Logic, TGen selected DDN's high-performance ExaScaler S2A9900 Parallel File Storage system to deliver over 5GB/s of data transfer and handle TGen's performance and scaling requirements. The system was delivered by Dell and integrated directly into the cluster's high-speed InfiniBand fabric.
- Finally, TGen's next-generation sequencers feed the cluster and storage system with large volumes of sequence data. TGen has one SOLiD System running and is in the process of adding four new SOLiD sequencers to increase workflow output and lower sequence run costs.
The Challenge
TGen has been a pioneer in the cluster computing industry for over 5 years. Challenged to scale file I/O performance to keep up with next-generation sequencing technology, TGen required a new storage approach and a scalable computing cluster for genome sequence alignment.
Solution
TGen deployed a high-performance DDN ExaScaler File Storage system featuring the DDN S2A9900 storage system and the Lustre File System.
By deploying the DDN ExaScaler solution, TGen eliminated I/O bottlenecks and now has in place a storage solution for today's and tomorrow's sequence alignment.
About TGen
The Translational Genomics Research Institute (TGen) is a non-profit organization dedicated to conducting groundbreaking research with life-changing results.
Research at TGen is focused on developing earlier diagnostics and smarter treatments for diseases such as cancer, neurological disorders and diabetes.
Arizona State University partners with TGen to provide a state-of-the-art computational facility in support of TGen's genomics research mission.
High-performance computing (HPC) clusters are used in high-throughput translational research to study the structure and function of DNA, RNA, and proteins, to identify variations and similarities, and to detect patterns that may be linked to disease. As an industry pioneer, TGen has always been tasked with pushing the scaling boundaries of cluster computing and life sciences. "TGen has always been on the cutting edge of technology, which is required to keep up with and advance scientific discovery," says James Lowey, who manages TGen's high performance computing division.
After working with traditional storage systems, TGen quickly understood that traditional storage methods were not appropriate for servicing a 4,000+ core cluster or managing the capacity explosion associated with bringing all of the new SOLiD systems online. When deciding what direction to take, TGen turned to Dell to deliver a next-generation storage cluster. Since DDN is Dell's leading partner for performance- and capacity-focused data storage installations, DDN was the natural choice for this project.
To satisfy TGen's I/O requirements, DDN delivered a high-performance file storage system that has been deployed on many of the world's largest HPC systems: the ExaScaler S2A9900 file storage system. This technology combines the award-winning S2A9900 SAN Storage System with the Lustre File System to provide unrivaled cluster file storage performance and cost-effective HPC storage. The ExaScaler S2A9900 is capable of delivering up to 200GB/s of single file system bandwidth, powering over 200,000 compute CPUs and storing up to 10 Petabytes of file system capacity, allowing TGen to grow in any direction without requiring a forklift upgrade. Additionally, DDN's extensive HPC experience was critical in rapid system deployment and performance optimization. TGen now has infrastructure that scales easily as science accelerates, allowing them to focus on translational research, not computer design. "We chose DDN because we can continue to scale with the S2A & Lustre, which is critically important, and the system is superior to all others we evaluated," said James. "I've been impressed with the product and look forward to working with DDN going forward - anything that simplifies cluster storage is definitely appreciated at TGen."