DDN BLOG

For the past few years we’ve been hearing that POSIX I/O is nearing its end because of non-conformity and design issues that do not fit modern application interfaces. POSIX was designed about 30 years ago without better multi-threaded and parallel I/O techniques in mind, and that gap is currently one of the major challenges in getting POSIX parallel file systems to scale up.

Although it is true and widely accepted that a new I/O programming model and standard must become generally available in the near future for HPC and technical applications, which mostly require highly efficient I/O parallelism, POSIX is still the de facto standard for HPC, technical, and commercial applications. It is expected to remain so for many more years, until a move to other, more intelligent and parallel-aware technologies becomes possible.

Having said that, it is natural to expect new features to keep arriving that improve the efficiency of parallel file systems so they can keep pace with the advance of existing applications.

With the recent acquisition and formation of Whamcloud, DDN has committed to an aggressive Lustre file system roadmap that aims to simplify I/O workflows and make Lustre better utilized in fields outside traditional HPC. Highly parallelized I/O was once a characteristic of HPC alone, but it is now very common even in commercial applications that explore big data and analytics.

I remember, about 18 years ago, benchmarking large data warehouses built on top of relational databases and bragging about sustaining 6GB/s with parallel queries under Oracle 8. Today, those performance goals can easily be reached by single-node crawling queries on data lake architectures as well as on POSIX file systems. In fact, we are seeing two orders of magnitude higher performance on large HPC systems running Lustre, where 6GB/s can be achieved with a single I/O thread.

So, if performance isn’t the issue, why doesn’t everyone run Lustre in their I/O-intensive environments?

The answer lies in the feasibility and complexity of utilization and deployment, data management requirements, RAS (reliability, availability, and serviceability), and security. The new Lustre roadmap tackles these four areas and adds features to the Lustre code that will allow even more commercial applications to take advantage of Lustre performance without sacrificing the other aspects.

We aim to make Lustre easier to manage and deploy, requiring a much lower skill set; this is one of the concerns customers report most often. Exciting new features for deploying, managing and tuning Lustre will start appearing this November at SC18 in Dallas. The simplification of how we deploy systems will be one of DDN’s announcements (more on that will be revealed just before the show). The strategy is to be expanded across the entire product line, from small to very large systems. The idea goes beyond that: the ability to build transparent tiers that make the most of hybrid systems, a remote replication framework, and synchronous replication, among many others, are to be announced over the next couple of months of development.

Another great announcement, though not necessarily a new one, is the wide adoption of Cloud storage. Lustre currently runs on two major Cloud platform offerings (AWS and Azure), and it is now being revamped with the adoption of Google Cloud Platform as well as some others (to be announced). Cloud is an extremely important strategy for the Lustre community, since many customers will benefit from having a remote Lustre file system sitting in the Cloud for some applications. The Cloud offering itself allows customers to physically segregate their workloads and, in many cases, allows for better integration with their user ecosystem. It also enables companies running special-purpose POSIX workflows to scale beyond the traditional single-node storage or NFS-based options that make up most of what is available on Cloud marketplaces.

For many years we’ve been emphasizing the need for highly flexible security in parallel file systems, which lets customers pick and choose what they need to build a very secure parallel file system environment. Unlike other offerings, we took a modular, selective approach in which features can be cherry-picked and, with some help, highly flexible and secure environments can be architected. We continue to improve it and to expand some features beyond the traditional “very secure sites” to commercial and Cloud-based deployments. Currently, DDN has customers running multi-tenant Lustre and is planning to expand its security capabilities.

For more details on Lustre, the DDN and Whamcloud acquisition, roadmaps, or simply for a cup of coffee, visit us or register to schedule an appointment at DDN booth #3213 at SC18. We’ll also have demos and talks at Google’s booth, the DDN Users Group, and more!

  • Carlos Aoki Thomaz
  • DDN Senior Product Manager
  • Date: October 31, 2018