Supercharge Your AI Workloads with Oracle Cloud and DDN

Leverage local NVMe SSD-based Ultra-Fast Object Storage using OCI DenseIO Compute and DDN Infinia

This blog shares the results of a joint Proof of Concept (POC) between Oracle and DDN, highlighting the architecture and performance outcomes that demonstrate how the DDN Infinia solution on OCI IaaS Compute enables a new class of high-performance AI workloads on OCI.

As AI workloads mature from training-focused experiments to real-time, production-scale systems, enterprises need infrastructure that meets the performance demands of each stage, from model training to retrieval-augmented generation (RAG) and real-time inference.

S3-compatible storage has emerged as a standard interface for modern AI platforms, thanks to its flexibility and cloud-native integration. However, workloads like RAG, interactive inferencing, and agentic AI often require high-concurrency, low-latency access to object data — pushing the performance envelope further.

Oracle Cloud Infrastructure (OCI) and DDN Infinia are coming together to address these demands. Infinia is a software-defined, S3-compatible KV store optimized for GPU-driven AI, delivering high IOPS, low latency, and scalable throughput. DDN Infinia offers:

  • Extreme S3 Performance: Ultra-low latency, high throughput, and millions of IOPS (Obj/sec)
  • Unified Data Access: Connects multi-modal data across environments without silos
  • Massive Scale: Supports 200,000+ GPUs and exabyte-scale datasets
  • Native Multi-Tenancy: Isolate and manage workloads with QoS and dynamic scaling

Deployment Architecture

This POC set out to demonstrate the scalability and performance of DDN Infinia software-defined storage running on OCI bare metal servers to support high-throughput, low-latency S3 workloads for AI/ML, media, and HPC. We recommend BM.DenseIO.E4 or BM.DenseIO.E5 shapes as server nodes.

Figure 1: OCI Deployment Architecture for DDN Infinia Scale-Out Setup

Benchmark Setup

Server Nodes: 6 x OCI BM.DenseIO.E5

The Infinia server component was deployed on OCI BM.DenseIO.E5 bare metal instances, each equipped with 2× AMD EPYC 9J14 CPUs (128 OCPUs total), 1.5 TB RAM, 12× 6.8 TB NVMe SSDs, and a single 100 GbE high-speed networking interface.

Figure 2: DDN Infinia POC Setup at OCI – Server Config

DDN Infinia was deployed across six BM.DenseIO.E5 nodes, forming a single logical storage cluster that exposed a unified S3-compatible namespace. We recommend a minimum of six server nodes for the initial Infinia storage configuration to ensure high availability and uninterrupted data access if one or more server nodes are taken offline (for software or hardware maintenance). The setup provided ~450 TB of pre-protection usable capacity (six nodes × 12 × 6.8 TB ≈ 490 TB raw) with full support for erasure coding, metadata indexing, and distributed high-performance S3 object access.
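Because the cluster presents a standard S3 endpoint, any S3 SDK can address the unified namespace directly. Here is a minimal boto3 sketch; the endpoint URL, bucket name, and credentials are hypothetical placeholders, not values from the POC:

    import boto3

    s3 = boto3.client(
        "s3",
        endpoint_url="http://infinia.example.internal:8000",  # hypothetical Infinia S3 endpoint
        aws_access_key_id="INFINIA_ACCESS_KEY",               # placeholder credentials
        aws_secret_access_key="INFINIA_SECRET_KEY",
    )

    # Objects land in one logical namespace, regardless of which of the six
    # server nodes physically holds the data.
    s3.put_object(Bucket="ai-datasets", Key="models/checkpoint-001.bin", Body=b"...")
    obj = s3.get_object(Bucket="ai-datasets", Key="models/checkpoint-001.bin")
    print(obj["Body"].read()[:16])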

Client Nodes: 6 x BM.Standard.E5.192

The S3 client nodes were provisioned on OCI BM.Standard.E5.192 shapes to simulate AI/ML applications accessing DDN Infinia storage via S3 protocol. Each node included: 2× 96-core AMD EPYC CPUs (192 OCPUs), 2.2 TB RAM, and a single 100 GbE high-speed networking interface.

Client nodes generated high-concurrency GET/PUT workloads using industry-standard tools such as the AWS CLI, s5cmd, and the warp benchmark, emulating AI applications performing inference, RAG, and real-time streaming I/O.
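For a sense of what these clients do, here is a minimal Python sketch of a high-concurrency GET/PUT load generator in the spirit of warp. The endpoint, bucket, worker count, and object sizes are illustrative assumptions, not the POC's exact parameters, and credentials are assumed to come from the environment:

    import concurrent.futures
    import os
    import time

    import boto3
    from botocore.config import Config

    s3 = boto3.client(
        "s3",
        endpoint_url="http://infinia.example.internal:8000",  # hypothetical endpoint
        config=Config(max_pool_connections=256),  # allow the full worker count in flight
    )

    BUCKET, WORKERS, OBJECTS, SIZE = "bench", 256, 10_000, 64 * 1024  # 64 KiB objects
    payload = os.urandom(SIZE)

    def put_one(i):
        s3.put_object(Bucket=BUCKET, Key=f"obj-{i:06d}", Body=payload)

    def get_one(i):
        s3.get_object(Bucket=BUCKET, Key=f"obj-{i:06d}")["Body"].read()

    for name, fn in (("PUT", put_one), ("GET", get_one)):
        start = time.perf_counter()
        with concurrent.futures.ThreadPoolExecutor(max_workers=WORKERS) as pool:
            list(pool.map(fn, range(OBJECTS)))
        print(f"{name}: {OBJECTS / (time.perf_counter() - start):,.0f} obj/s")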

Performance Results

S3 Metadata and Data Performance

The S3 performance testing measured Obj/sec for small objects and for metadata operations, throughput for large objects, and latency, including Time-to-First-Byte (TTFB), for S3 object access.

S3 Data Operation    Obj/sec    Throughput    Latency
PUT                  52 K/s     27.6 GiB/s    4 ms
GET                  225 K/s    34.6 GiB/s    1.7 ms

S3 Metadata Operation    Obj/sec
LIST                     194 K/s
STAT                     345 K/s

Time-to-First-Byte (TTFB): 5 ms

Table 1: Aggregate DDN Infinia S3 performance on OCI across the six BM.DenseIO.E5 server nodes
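As an aside, client-side TTFB can be approximated by timing a GET until its first byte arrives. The sketch below illustrates the idea; it is not the POC's actual measurement harness, and the endpoint, bucket, and key are hypothetical:

    import time

    import boto3

    s3 = boto3.client("s3", endpoint_url="http://infinia.example.internal:8000")

    def ttfb_ms(bucket, key):
        start = time.perf_counter()
        body = s3.get_object(Bucket=bucket, Key=key)["Body"]
        body.read(1)  # block until the first byte of the object arrives
        elapsed_ms = (time.perf_counter() - start) * 1000
        body.close()
        return elapsed_ms

    samples = sorted(ttfb_ms("ai-datasets", "models/checkpoint-001.bin") for _ in range(100))
    print(f"median TTFB: {samples[len(samples) // 2]:.1f} ms")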

Next, we compare the performance of DDN Infinia on OCI using the following S3 tools and benchmarks:

Warp: This tool is built for speed; it issues large numbers of parallel requests to max out the network and gets close to the theoretical limit of the BM.DenseIO.E5's 100 Gbps NIC. Thus, we see 10.6 GiB/s of throughput with warp.

AWS CLI cp: This is written in Python and isn’t really optimized for high-speed transfers or heavy parallelism. It just can’t keep up with the hardware, so the best it managed was 2.2 GiB/s.

s5cmd cp: This is written in Go, which handles concurrency much better than Python, so it can push more data in parallel and outperform AWS CLI cp, but it still doesn't reach what warp can do. It lands in the middle at 4.7 GiB/s.

The more optimized the tool is for parallelism and efficient data transfer, the closer you get to saturating your network link. That’s why we see such a big gap between these results.

S3 Tool/Benchmark    Single S3 Client to Single S3 Server Throughput
warp benchmark       10.6 GiB/s
AWS CLI cp           2.2 GiB/s
s5cmd cp             4.7 GiB/s

Table 2: Single S3 Client to Single S3 Server DDN Infinia throughput on OCI
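To make the parallelism effect behind Table 2 concrete, here is a hedged sketch contrasting one sequential GET stream with 64 concurrent ranged GETs against the same large object. The endpoint, bucket, key, and part count are assumptions for illustration only:

    import concurrent.futures
    import time

    import boto3
    from botocore.config import Config

    s3 = boto3.client(
        "s3",
        endpoint_url="http://infinia.example.internal:8000",  # hypothetical endpoint
        config=Config(max_pool_connections=64),
    )
    BUCKET, KEY, PARTS = "bench", "large/sample-object", 64
    SIZE = s3.head_object(Bucket=BUCKET, Key=KEY)["ContentLength"]

    def fetch_range(start, end):
        body = s3.get_object(Bucket=BUCKET, Key=KEY, Range=f"bytes={start}-{end}")["Body"]
        for _ in body.iter_chunks(chunk_size=1 << 20):  # stream and discard
            pass

    def timed(label, fn):
        t0 = time.perf_counter()
        fn()
        print(f"{label}: {SIZE / (time.perf_counter() - t0) / 2**30:.1f} GiB/s")

    # Single sequential stream: one connection, well below the NIC limit.
    timed("1 stream", lambda: fetch_range(0, SIZE - 1))

    # Concurrent ranged parts, closer to how warp and s5cmd drive the link.
    part = -(-SIZE // PARTS)  # ceiling division
    ranges = [(i * part, min((i + 1) * part, SIZE) - 1) for i in range(PARTS)]
    def parallel():
        with concurrent.futures.ThreadPoolExecutor(max_workers=PARTS) as pool:
            list(pool.map(lambda r: fetch_range(*r), ranges))
    timed(f"{PARTS} streams", parallel)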

On OCI, DDN Infinia consistently delivered:

  • Sustained high Obj/sec (IOPS),
  • Sustained high throughput,
  • Latency and TTFB in the low single-digit milliseconds, and
  • Storage performance that scales with the number of scale-out OCI server nodes.

Performance Scalability

Infinia S3 performance and scalability were assessed using the popular warp benchmark. DDN Infinia S3 throughput scales with the number of scale-out OCI server nodes (Figure 3) until a component of the hardware stack saturates (the 100 GbE network interface in this case). The aggregate S3 throughput is balanced across the participating OCI server nodes, with each node sustaining ~4.6 GiB/s for PUT and ~5.7 GiB/s for GET operations.

Figure 3: DDN Infinia Scale-Out OCI Setup – S3 Throughput Performance Scaling
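A quick back-of-the-envelope check, using only figures reported above, ties the per-node and aggregate numbers together:

    # Sanity-check the scaling figures using the numbers reported above.
    nodes = 6
    put_per_node, get_per_node = 4.6, 5.7  # GiB/s sustained per server node

    # 100 Gbps NIC expressed in GiB/s: 100e9 bits/s -> bytes/s -> GiB/s.
    nic_ceiling = 100e9 / 8 / 2**30

    print(f"aggregate PUT  ~ {nodes * put_per_node:.1f} GiB/s")  # ~27.6 GiB/s, matches Table 1
    print(f"aggregate GET  ~ {nodes * get_per_node:.1f} GiB/s")  # ~34.2 GiB/s, close to Table 1's 34.6
    print(f"per-node NIC ceiling ~ {nic_ceiling:.1f} GiB/s")     # ~11.6 GiB/s, near warp's 10.6 GiB/s

Per-node GET throughput of ~5.7 GiB/s sits comfortably below the ~11.6 GiB/s NIC ceiling, which is why aggregate throughput keeps scaling with node count until the network interface becomes the limiting component.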

DDN Infinia's S3 metadata and data Obj/sec performance likewise scales with the number of scale-out OCI server nodes (Figure 4), and the aggregate Obj/sec performance is balanced across the participating nodes.

Figure 4: DDN Infinia Scale-Out OCI Setup – S3 Obj/sec Performance Scaling

Conclusion

The joint POC between Oracle and DDN confirmed that DDN Infinia on OCI delivers linear scalability, high S3 throughput, ultra-low latency, and massive concurrency. Together, Oracle and DDN provide a cloud-native stack built for:

Faster Time-to-Insight, Higher GPU Efficiency – By eliminating I/O bottlenecks and delivering sustained high-performance object access, Infinia ensures that GPUs and CPUs on OCI remain fully utilized. This translates into shorter training times, faster AI inference, and lower cost per inference.

High-Performance Scalable Data Storage – DDN Infinia's low latency, low TTFB, and high, scalable metadata Obj/sec (IOPS) and data throughput make it an ideal store for high-performance applications, accelerating AI LLM inference and RAG workloads. Its low latency and TTFB mean faster access to S3 objects, which speeds up AI token generation and content delivery for streaming providers at the edge and in CDN caching devices (helping absorb bursts of user requests).

Optimized for OCI's High-Performance Architecture – Running on OCI's dense bare metal compute and 100 GbE network fabric, Infinia takes full advantage of Oracle's infrastructure, delivering consistent, stable, reliable performance at scale.

Multi-Tenant Ready with Rich Metadata – With native multi-tenancy, namespace isolation, and millions of tags, Infinia on OCI allows teams to manage AI datasets intelligently — supporting shared infrastructure without added complexity.

Low Infrastructure Cost on OCI – The OCI BM.DenseIO.E5 shape delivers the most cost-effective[1] NVMe storage, offering the lowest cost per TB and sustaining cost-efficiency as capacity requirements grow.
