Transforming Drug Discovery at Recursion with DDN AI Platforms
DDN®, the global leader in artificial intelligence (AI) and multicloud data management, announced that its high-performance storage solutions have helped to support broader, faster and more efficient drug discovery research and operations at Recursion, a digital biology company industrializing drug discovery through the combination of automation, AI and machine learning (ML) to discover novel medicines. Read the Recursion case study, which describes the challenges, solution and benefits of the infrastructure and reveals details of the global-impact work that’s taking place in Salt Lake City.
With most of the industry fighting mounting drug-discovery costs and time-to-market challenges, Recursion takes a new approach that combines scientific and technological methods. This requires high-performance drug discovery processing that is fully optimized for AI and ML and designed to unlock the maximum value of data from the world’s largest repository of biological images.
“Our data is our company, so we needed a robust storage architecture to support our AI-driven models,” said Kris Howard, principal systems engineer at Recursion. “Managing our at-scale data needs requires fast ingest, optimized processing and reduced application run times.”
In collaboration with DDN’s domain experts, Recursion initially created a proof of concept encompassing DDN’s EXAScaler ES400NV® and ES7990X® parallel filesystem appliances, which were later scaled to 2PB of capacity for staging ML models. An all-flash layer serves as a front end to the file system, which is backed by ample spinning disk. The first 64K of each file is stubbed to this flash layer, which accelerates access to the first part of the data before streaming the rest to spinning disk.
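As a rough illustration of the tiering described above, the following Python sketch splits a byte-range read into the portion that would land on the flash layer and the portion that would fall through to spinning disk. It is not DDN's implementation or API: the function name `split_read` and the constant `STUB_BYTES` are hypothetical, and "64K" is assumed to mean 64 KiB.

```python
# Hypothetical model of a flash-stub tier; not DDN's actual implementation.
STUB_BYTES = 64 * 1024  # assumption: "64K" is read as 64 KiB


def split_read(offset: int, length: int, file_size: int,
               stub_bytes: int = STUB_BYTES) -> tuple[int, int]:
    """Return (flash_bytes, disk_bytes) for a read of `length` bytes at `offset`.

    Bytes in [0, stub_bytes) are assumed to live on the flash layer;
    everything past that is assumed to live on spinning disk.
    """
    if offset < 0 or length < 0 or file_size < 0 or stub_bytes < 0:
        raise ValueError("offset, length, file_size and stub_bytes must be non-negative")

    # Clamp the read to the end of the file.
    end = min(offset + length, file_size)
    if end <= offset:
        return (0, 0)

    # The flash tier never extends past the end of a small file.
    flash_end = min(stub_bytes, file_size)
    flash_bytes = max(0, min(end, flash_end) - offset)
    disk_bytes = (end - offset) - flash_bytes
    return (flash_bytes, disk_bytes)


if __name__ == "__main__":
    # A small header read is served entirely from the flash stub.
    print(split_read(0, 4096, 1_000_000))
    # A full-file read takes its first 64 KiB from flash, the rest from disk.
    print(split_read(0, 1_000_000, 1_000_000))
```

In this model, small reads at the start of a file never touch spinning disk, which is consistent with the claim that the flash layer speeds up access to the first part of each file.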
With DDN, Recursion executes about 350,000 experiments weekly and screens thousands of compounds against hundreds of disease models, now at a fraction of the cost and time of traditional drug discovery methods. DDN’s hybrid high-performance scalable storage solutions, fully optimized for AI and ML, have helped to decrease costs and increase the efficiency of biological research.
“DDN’s reputation as a storage leader is reinforced by our mature solutions and increasing focus on AI data storage,” said Paul Bloch, president and co-founder at DDN. “Leveraging Intelligent Infrastructure to deliver the most comprehensive set of data-centric AI-enabled solutions, DDN’s flexibility in sizing Recursion’s configuration to meet specific workloads has resulted in robust storage that seamlessly supports 18 nodes and 136 GPUs. Being trusted as the storage infrastructure provider for Recursion is a true honor as they work to disrupt traditional drug discovery methods and identify treatments for disease with precision and efficiency.”
While traditional storage architectures would not meet Recursion’s stringent high-performance file processing demands, DDN’s 2PB high-performance, multi-tier data management infrastructure has helped to maximize GPU compute resources for accelerated AI workflows. Not only did this approach deliver extremely fast performance for Recursion’s demanding workloads, it also helped to alleviate file-access bottlenecks while enabling efficient streaming to the GPUs.
“Our DDN storage is wicked fast,” said Howard. “The Flash layer resulted in a 40% reduction in file access time, and we can get our GPUs to 100% utilization, and keep them pegged there. It’s highly unusual to train data off a PFS, but it’s a perfect solution for our use case.”