Lustre is moving fast, adding features and functionality that make it more consumable across a wider variety of sites and use cases. As Lustre’s popularity grows, so does the tension between users who want it to be easier to consume and those who rely on it to get bigger and faster. DDN has a foot in both camps. The longtime leader in sales and support of Lustre environments at scale, DDN is also the #1 reseller of the enterprise-focused Intel® Enterprise Edition for Lustre* Software with our EXAScaler solution.
To serve our customers in both segments, we have been focusing on performance, stability and management features in our upcoming EXAScaler 3.0 release and in our open source contributions.
EXAScaler 3.0, which is expected out later this year, includes core Lustre updates as well as DDN-only capabilities. It takes the 2.7 core from Intel® Enterprise Edition for Lustre 3.0 – for which DDN was the developer or lead test organization for every major feature – and adds several significant performance, availability and management features within DDN’s SFAOS as well.
The common theme across the majority of the DDN-developed Lustre features, open source contributions, and new SFAOS features is improving the systems needed to scale Lustre and to manage it at scale, while continuing to advance the file system’s capabilities in step with new technologies introduced in the industry.
EXAScaler 3.0 brings all the new features and capabilities coordinated and managed by Intel in its Enterprise Edition for Lustre, and adds unique and exclusive features such as high availability improvements, metadata scaling, a project quota back end, and support for a differentiated storage services framework, to name a few.
DDN is committed to aligning features and product capabilities with the most recent industry-standard technologies. The release of EXAScaler 3.0 brings a Lustre version fully qualified to run on the most advanced DDN storage, the SFA14K™ family (both block and embedded versions), with support for state-of-the-art network technologies such as Intel® Omni-Path Architecture and InfiniBand EDR in native and 100Gbit Ethernet modes, as well as the latest OFED versions and Linux operating systems. SFA14K features improve the performance of block and embedded storage appliances and bring new reliability features such as online updates for converged appliances, an improved SSD cache layer, and out-of-band hinting.
DDN is a leading Lustre supporter. We constantly contribute to the Lustre code with feature development and code reviews, and we share Lustre code and tools with the open source community to better support customers’ needs. The EXAScaler 3.0 release also delivers several new features, mostly exclusive to EXAScaler users. One of the highlights of this version is the ability to differentiate I/O requests and, based on the type of request, send advice (hints) from clients to servers. “Ladvise” enables applications and users to intervene in cache management (e.g., pinning objects to the cache layer or instructing the server to bypass the cache). This feature lets customers move forward with SSD and HDD (rotational spindle) “tiering” capabilities that accelerate and optimize the storage of different data patterns. The feature is implemented as a user-level command under the “lfs” Lustre utility, and can also be integrated with the SFX 1.1 out-of-band hint mechanisms for a more advanced and complete cache management solution.
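To make the hint mechanism more concrete, here is a minimal sketch of how a script might assemble an “lfs ladvise” invocation. It assumes the option names of the upstream Lustre “lfs ladvise” command (-a for the advice, -s for the start offset, -l for the length) and the upstream advice names “willread” and “dontneed”; the exact syntax and advice types shipped in EXAScaler 3.0 may differ. The helper name build_ladvise_command and the file path are hypothetical, and the sketch only builds the argument list rather than running it.

```python
# Advice names taken from the upstream `lfs ladvise` command; whether
# EXAScaler 3.0 ships exactly this set is an assumption.
KNOWN_ADVICE = {"willread", "dontneed"}


def build_ladvise_command(path, advice, start=0, length=None):
    """Return an argv list for `lfs ladvise` (hypothetical helper).

    path   -- file on a mounted Lustre client
    advice -- hint to send to the servers, e.g. "willread" to prefetch
              into the server-side cache or "dontneed" to release it
    start  -- byte offset where the advised range begins
    length -- size of the advised range in bytes; None omits the option
    """
    if advice not in KNOWN_ADVICE:
        raise ValueError(f"unsupported advice: {advice!r}")
    if start < 0:
        raise ValueError("start must be >= 0")
    if length is not None and length <= 0:
        raise ValueError("length must be > 0")

    argv = ["lfs", "ladvise", "-a", advice, "-s", str(start)]
    if length is not None:
        argv += ["-l", str(length)]
    argv.append(path)
    return argv
```

For example, build_ladvise_command("/mnt/lustre/data/input.bin", "willread", 0, 1 << 30) yields the arguments for “lfs ladvise -a willread -s 0 -l 1073741824 /mnt/lustre/data/input.bin”, which a job prologue could pass to subprocess.run before a read-heavy phase.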
DDN has a long-term project to introduce changes to the ext4-OSD back end that will allow customers to implement project quotas efficiently. Currently, Lustre relies on standard POSIX attributes (UID and GID) that are not robust enough for managing quotas on large deployments and multi-organizational implementations. Project quota support has been widely discussed among community developers, and the ideas proposed over the last few years have shaped the work. DDN has now implemented an ext4-OSD back end capable of handling project quotas which, combined with OST pools, will enable a wide variety of quota controls beyond the UID/GID quotas that are standard on ext4.
Due to the complexity of this development, DDN decided to split the effort into two major development milestones. The first is to redefine the project quota layer at the OSS and MDS level. This step required significant changes to the ext4-OSD layer and is available to customers using EXAScaler 3.0. The second milestone is the user-level project quota facility, which will be available in the EXAScaler 3.1 release early next year. Customers willing to upgrade today to EXAScaler 3.0 will have an ext4-OSD back end ready for the project quota implementation in the near future (when EXAScaler 3.1 is released) and will not need to reformat their entire file system. The support provided in EXAScaler 3.0 enables a smooth transition.
Another set of changes being introduced in EXAScaler 3.0 is in security, where Kerberized and multi-tenant support for Lustre has been added. Sounds complex? It is! This is a very large and complex feature set that requires major efforts in development, quality assurance, compliance testing and certification, from the OS kernel level on the servers up to the user tools on the Lustre clients. Because these requirements are so extensive, DDN is working together with Intel and the rest of the Lustre community to deliver the functionality in phases, with several software modules and components released at a time, including in the next versions of Lustre.
The first of these is delivered in EXAScaler 3.0, where DDN is introducing a “Subdirectory mounts” feature. Subdirectory mounts allow customers to mount Lustre subdirectories instead of the entire namespace. By doing so, a sys admin can segregate access per Lustre client type, restricting a given subdirectory to one privileged client or a set of them. For example, a single Lustre namespace could have two major subdirectories, one for “Department A” and another for “Department B.” Clients that are managed and provisioned by “Department A” would mount only its subdirectory, and the same goes for “Department B.” This way, the two sets of clients will not have access to, or even visibility of, each other’s departmental data. This can also be integrated with the standard DNE phase I feature, where a single MDT could be assigned to each department’s root directory, isolating not only file access but also metadata performance. While the option to mount the entire namespace remains, this feature will be necessary to implement multi-tenant environments, which are so often a prerequisite for large organizations looking for a secure file system implementation.
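The two-department example can be sketched as follows. This is an illustration under stated assumptions, not DDN’s provisioning tooling: it assumes the upstream Lustre client mount syntax (mount -t lustre MGS:/fsname/subdir mountpoint) and the upstream “lfs mkdir -i” command for placing a directory on a specific MDT under DNE phase I. The MGS NID “mgs01@o2ib”, the file system name “projfs”, the directory names, and both helper functions are hypothetical. The code only builds command lines; it does not execute them.

```python
def _clean_subdir(subdir):
    """Normalize a subdirectory path and reject '.' or '..' components."""
    parts = [p for p in subdir.split("/") if p]
    if not parts or any(p in (".", "..") for p in parts):
        raise ValueError(f"invalid subdirectory: {subdir!r}")
    return "/".join(parts)


def subdir_mount_command(mgs_nid, fsname, subdir, mountpoint):
    """Build the client-side mount command for one subdirectory only."""
    device = f"{mgs_nid}:/{fsname}/{_clean_subdir(subdir)}"
    return ["mount", "-t", "lustre", device, mountpoint]


def mdt_assignment_command(admin_mount, subdir, mdt_index):
    """Build the admin-side command that creates a department root on a
    chosen MDT, run from a client that mounts the whole namespace."""
    if mdt_index < 0:
        raise ValueError("mdt_index must be >= 0")
    target = f"{admin_mount.rstrip('/')}/{_clean_subdir(subdir)}"
    return ["lfs", "mkdir", "-i", str(mdt_index), target]


# Hypothetical layout: each department gets its own root directory and MDT.
DEPARTMENTS = {"dept_a": 0, "dept_b": 1}


def provisioning_plan(mgs_nid="mgs01@o2ib", fsname="projfs",
                      admin_mount="/mnt/projfs"):
    """Return, per department, the admin command and the client command."""
    plan = {}
    for name, mdt_index in DEPARTMENTS.items():
        plan[name] = {
            "admin": mdt_assignment_command(admin_mount, name, mdt_index),
            "client": subdir_mount_command(mgs_nid, fsname, name,
                                           f"/mnt/{name}"),
        }
    return plan
```

In this sketch, a client provisioned by Department A would run only the “client” command for dept_a, so the dept_b directory never appears under its mount point, while the “admin” commands are reserved for a privileged node that still mounts the full namespace.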
To maintain an active role in the Lustre development and user community, DDN has decided to start an open source project and share some of its tools and Lustre developments. DDN understands that Lustre is fast-moving and dynamic, and that it requires feedback from and participation of multiple sources (customers, partners, non-customers). With that in mind, we’ve decided to share, on an open source basis, some of the projects we are currently working on. That initiative includes tools for Lustre developers, such as the “Lustre Automatic Test System,” which will help developers run regression tests in parallel on hundreds of test nodes; a Lustre monitoring framework based on industry standards such as Collectd, with elastic search capabilities on top of a NoSQL database and Hadoop, and visualization in Grafana; and parallel copy tools and internal Lustre mechanisms that will move and copy data between OSTs and across namespaces more effectively and in parallel. DDN is also working with a major OS distribution to provide, in the long term, an alternative OSD back end based on BTRFS.
The towering strength of the Lustre community comes from its members’ deep participation. The more DDN develops and contributes, the more guidance we get from end users, partners and even competing vendors. Knowing what the community needs lets DDN define better and more numerous use case scenarios, which in turn improves and accelerates our own product development and lets us pay our way by contributing back solid and usable tools and frameworks.