The NVIDIA DGX SuperPOD™ with NVIDIA DGX™ A100 systems is a next-generation, state-of-the-art artificial intelligence (AI) supercomputing infrastructure that delivers groundbreaking performance, deploys in weeks as a fully integrated system, and is designed to solve the world's most challenging AI problems.

The groundbreaking performance delivered by the DGX SuperPOD with DGX A100 systems enables the rapid training of deep learning (DL) models at scale. The DGX SuperPOD set records in all eight MLPerf at-scale benchmarks. To sustain this record-breaking performance, the DGX SuperPOD must be paired with a fast storage system.

In this paper, the DDN® A³I AI400X appliance was evaluated for its suitability for supporting DL workloads when connected to the DGX SuperPOD. AI400X appliances are compact, low-power storage platforms that deliver very high raw performance, using NVMe drives for storage and InfiniBand or Ethernet (including RoCE) as their network transport. Each appliance fully integrates the DDN EXAScaler parallel filesystem, which is optimized for AI and HPC applications. EXAScaler is based on the open-source Lustre filesystem with the addition of extensive performance, manageability, and reliability enhancements. Multiple appliances can be aggregated into a single filesystem to create a very large namespace. A turnkey solution validated at scale with the DGX SuperPOD, the AI400X appliance is backed by DDN global deployment and support services.
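Because EXAScaler is Lustre-based, client systems access it the way they would any Lustre filesystem. As a minimal sketch (not the validated DGX SuperPOD procedure), mounting such a filesystem on a client node over InfiniBand might look like the following; the MGS address `10.0.0.1@o2ib`, filesystem name `ai400x`, and mount point are hypothetical placeholders:

```shell
# Load the Lustre client modules (assumes the Lustre client software is installed)
modprobe lustre

# Mount the EXAScaler/Lustre filesystem over InfiniBand (the o2ib LNet network).
# 10.0.0.1@o2ib is a hypothetical MGS NID; ai400x is a hypothetical filesystem name.
mkdir -p /mnt/ai400x
mount -t lustre 10.0.0.1@o2ib:/ai400x /mnt/ai400x

# Verify the mount and the aggregate capacity presented by the appliances
df -h /mnt/ai400x
```

When multiple AI400X appliances are aggregated, clients still issue a single mount of this form and see one namespace spanning all appliances.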