It’s no secret: Hadoop® is not an entirely plug-and-play ecosystem. With data being wrangled at so many stages to gain critical business insights, one of today’s challenges is how to store the data while introducing an easy-to-use and cost-efficient storage layer that will result in a successful Big Data strategy.
DDN solutions are engineered to scale cost-effectively from fewer than 100 TB to hundreds of petabytes. DDN’s integrated end-to-end storage solutions span high performance (Flash, HDD), Active Archive, Deep Archive, Cloud, and DR to deliver the best price/performance in the market with robust and feature-rich enterprise-quality storage that protects your most valuable asset—your data.
Abdulrahman Alkhamees - Hadoop and General Analytics Use Cases
Abdulrahman Alkhamees presents how you can use Hadoop to accelerate HPC workflows and shares use cases in leveraging consolidated, scalable storage architectures for faster results.
With powerful, policy-driven automation and tiered storage management, organizations can use DDN solutions to create optimized tiered storage pools by grouping devices (flash, SSD, disks, or tape) based on performance, locality, or cost. Accessing objects through file interfaces (SMB/NFS/POSIX) and accessing files through object interfaces (REST) helps legacy applications designed for files integrate seamlessly into the object world. Multi-protocol access to files and objects in the same namespace (with common user ID management) makes it possible to host different types of data with multiple access options. It also optimizes various use cases and solution architectures, resulting in better efficiency and cost savings.
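As a rough illustration of what a tiering policy of this kind does, the sketch below assigns a file to a storage pool based on how long it has been idle. It is not DDN's policy engine or syntax: the pool names, the idle-time thresholds, and the `choose_pool` helper are all hypothetical, chosen only to show the idea of grouping devices by performance and cost.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Pool:
    """A hypothetical storage pool with an idle-time cutoff in days."""
    name: str
    max_idle_days: float


# Assumed pools, ordered from fastest and most expensive to slowest and cheapest.
POOLS = [
    Pool("flash", 7),
    Pool("disk", 90),
    Pool("tape", float("inf")),
]


def choose_pool(idle_days: float, pools=POOLS) -> str:
    """Return the name of the first pool whose cutoff covers the idle time."""
    if idle_days < 0:
        raise ValueError("idle_days must be non-negative")
    for pool in pools:
        if idle_days <= pool.max_idle_days:
            return pool.name
    # Fall back to the last (cheapest) pool if no cutoff matched.
    return pools[-1].name


if __name__ == "__main__":
    for days in (1, 30, 365):
        print(days, "days idle ->", choose_pool(days))
```

A real policy would typically weigh more than idle time, for example file size, owner, or locality, but the same pattern of ordered rules applies.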
Why should data be moved after it was ingested the first time? Why not ask questions immediately when the data lands on the data storage layer? DDN in-place analytics allows you to do just that and more as the data streams into your storage cluster.
DDN HADOOP STORAGE SOLUTIONS
Optimal Storage for Hadoop
The Hadoop-ready GRIDScaler® file system offers flexible choices for enterprise-grade data protection and availability features, together with the performance of a parallel file system. Combined with the ability to scale out to dozens or even hundreds of petabytes while sustaining high mixed I/O performance, GRIDScaler delivers complete data protection, data security, and data management options to supplement a powerful file storage solution.
- Reduces storage costs up to 90% with automatic policy-based storage tiering from flash through disk to tape
- Offers unified namespace through POSIX and HDFS protocols
- Supports services including HDFS, YARN, MapReduce, Pig, Hive/HiveServer2/HBase, ZooKeeper, Flume, Spark, and Sqoop
- Isolates Hadoop clusters using a single file system
- Tiers seamlessly in a single namespace from SSD to HDD, object storage, tape, and cloud
- Requires no changes to existing Hadoop applications
- Offers multi-protocol support with native access: NFS, SMB, OpenStack Swift, and S3
- Supports a wide range of Hadoop distributions