After the recent organizational changes at Intel HPDD, the Lustre* community has been wondering what the future of the most widely used parallel file system will look like. It has also been apprehensive about disruptive changes that could affect Lustre development and the customers who have adopted the technology as their main parallel file system. Although the new Lustre development and adoption strategy isn’t completely defined yet, it has turned out to be simpler, clearer, and more consistent than anticipated.

As in the old Whamcloud days, Lustre development has returned to a single code stream, which removes the confusion over different distributions, features, capabilities, and source code variants. As per the well-received announcement made at the most recent LUG in Bloomington, IN, Intel is setting 2.10 as the long-term support (LTS) release of Lustre, which should be the mainstream version for the next 18 to 24 months.

It is clear that Lustre is still strong and will continue to dominate the persistent parallel file system arena, at least for the next few years. Development of such a complex technology does not move as quickly as it does for many other applications, and even granting the idea that parallel file systems may soon be replaced by other technologies, there would still be a gap of a few years before such technologies became available.

DDN® announced in November 2016 that all of its Lustre features would be merged into the Lustre master branch, giving the entire community more transparent access to the code, reducing the overhead of managing code development, and aligning better with the upcoming feature set. That decision was made well before Intel’s announcement, which shows DDN’s vision and its trust in the foundations of open source development. Although numerous contributors and collaborators have asked why DDN would share these patches rather than keep them as a competitive advantage and differentiator, DDN has demonstrated that it is focused on delivering these features as a foundation framework coded into the Lustre file system. These features will then support DDN’s broader development, which is now looking into areas such as security, performance, RAS, and data management.

Along with the recently announced features, DDN proposes a novel approach for Lustre’s policy engine (LiPE) that aims to reduce installation and deployment complexity while delivering significantly faster results. LiPE relies on a set of components that allow the engine to scan Lustre MDTs quickly, create an in-memory mapping of the file system’s objects, and implement data management policies based on that mapped information. Initially, this approach allows users to define policies that trigger data automation via Lustre HSM hooks or external data management mechanisms. In the next stage of development, LiPE may be integrated with a File Heat Map mechanism for more automated and transparent data management, resulting in better utilization of the parallel storage infrastructure. (File Heat Map is another feature under development that will create a file mapping in which each object is weighted according to its utilization. For example, the weight of unmodified files will decay over time, indicating that such a file is likely a WORM-style file suitable for moving to a different disk tier.)
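To make the idea more concrete, here is a minimal Python sketch of a policy pass over an in-memory file map combined with a decaying heat weight. It is an illustration only and does not reflect LiPE’s or File Heat Map’s actual code or interfaces: the names `FileRecord`, `Policy`, `decay_heat`, and `evaluate`, the halving decay factor, the thresholds, and the file paths are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Set, Tuple


@dataclass
class FileRecord:
    """Hypothetical in-memory entry built from one metadata scan."""
    path: str
    size_bytes: int
    heat: float  # utilization weight; higher means more recently active


@dataclass
class Policy:
    """Hypothetical rule: a predicate plus a label for the action to take."""
    name: str
    matches: Callable[[FileRecord], bool]
    action: str  # label only; a real engine would call HSM or an external tool


def decay_heat(records: Dict[str, FileRecord], accessed: Set[str],
               factor: float = 0.5, bump: float = 1.0) -> None:
    """Apply one period: every weight shrinks, accessed files get a bump."""
    for path, rec in records.items():
        rec.heat *= factor
        if path in accessed:
            rec.heat += bump


def evaluate(records: Dict[str, FileRecord],
             policies: List[Policy]) -> List[Tuple[str, str]]:
    """Return (action, path) for the first policy that each record matches."""
    plan = []
    for rec in records.values():
        for policy in policies:
            if policy.matches(rec):
                plan.append((policy.action, rec.path))
                break
    return plan


# Usage: two files start equally "hot"; only the log is touched afterwards.
GIB = 2 ** 30
records = {
    "/lustre/a.dat": FileRecord("/lustre/a.dat", 4 * GIB, 1.0),
    "/lustre/b.log": FileRecord("/lustre/b.log", 1024, 1.0),
}
for _ in range(3):
    decay_heat(records, accessed={"/lustre/b.log"})

policies = [
    Policy("cold-large", lambda r: r.heat < 0.2 and r.size_bytes >= GIB, "archive"),
]
plan = evaluate(records, policies)
print(plan)  # [('archive', '/lustre/a.dat')]
```

Because the policy pass only reads the in-memory map, it never has to walk the file system again, which is the property the approach described above relies on for speed.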

Regarding performance, DDN has designed and developed a new Quality of Service (QoS) approach. QoS based on the Token Bucket Filter algorithm has been implemented at the OST level, allowing system administrators to define the maximum number of RPCs that a user, group, or job ID may issue to a given OST. This throttling provides I/O control and bandwidth reservation, so that higher-priority jobs can run in a more predictable time and avoid, for example, performance variations caused by I/O delays. A new initiative between DDN and a few renowned European universities will investigate a high-level tool, possibly at the user level, that would make QoS easier to use and configure through a set of usability enhancements.
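The token bucket idea itself is simple enough to sketch in a few lines of Python. The example below is a generic illustration of the algorithm, not the Lustre implementation: the class names `TokenBucket` and `OstThrottle`, the job ID, and the rates are hypothetical, and the clock is passed in explicitly so the behavior is easy to follow. In this sketch a refused RPC simply returns `False`; a server-side scheduler would more likely hold the request until a token becomes available.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass
class TokenBucket:
    """Rate limiter for one class of traffic (for example, one job ID)."""
    rate: float        # tokens (RPCs) added per second
    capacity: float    # maximum burst size, in tokens
    tokens: float = 0.0
    last_refill: float = 0.0

    def try_consume(self, now: float, cost: float = 1.0) -> bool:
        """Refill for the elapsed time, then admit the RPC if a token is left."""
        elapsed = max(0.0, now - self.last_refill)
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.last_refill = now
        if self.tokens >= cost:
            self.tokens -= cost
            return True
        return False


class OstThrottle:
    """Hypothetical per-OST table that maps a job ID to its token bucket."""

    def __init__(self) -> None:
        self.buckets: Dict[str, TokenBucket] = {}

    def set_limit(self, job_id: str, rpcs_per_second: float, burst: float) -> None:
        # Start with a full bucket so a new job is not penalized at startup.
        self.buckets[job_id] = TokenBucket(rpcs_per_second, burst, tokens=burst)

    def admit(self, job_id: str, now: float) -> bool:
        bucket = self.buckets.get(job_id)
        # Jobs without a rule are not throttled in this sketch.
        return True if bucket is None else bucket.try_consume(now)


# Usage: limit one job to 2 RPCs per second with a burst of 2.
throttle = OstThrottle()
throttle.set_limit("low_priority_job", rpcs_per_second=2.0, burst=2.0)

burst = [throttle.admit("low_priority_job", now=0.0) for _ in range(3)]
print(burst)                                         # [True, True, False]
print(throttle.admit("low_priority_job", now=0.5))   # True, one token refilled
print(throttle.admit("other_job", now=0.5))          # True, no rule defined
```

Limiting lower-priority jobs this way leaves the remaining OST capacity for the jobs that need predictable run times, which is the bandwidth-reservation effect described above.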

Other interesting and long-awaited features from DDN that will be available in Lustre 2.10 and its minor releases during the LTS cycle include the Project Quotas facility, single-thread performance enhancements, and secured Lustre (MLS and isolation), among others. In keeping with new HPC trends, a tremendous amount of work has also been invested in integrating Lustre with Linux container-based workloads, providing native Lustre file system capabilities within containers, and in support for new kernels and for specialized Artificial Intelligence and Machine Learning appliances. Customers who are moving toward software-defined storage may be surprised to learn that, as part of its parallel file system strategy, DDN has also recently announced that it will support ZFS and Lustre as a software-only offering.

To learn more about this topic, join our webinar on June 27, “Accelerating Lustre with DDN IME,” presented by James Coomer, technical director at DDN, and Laura Shepard, senior director of marketing at DDN. Register here.

  • Carlos Aoki Thomaz
  • DDN Senior Product Manager
  • Date: June 23, 2017