Like ‘big data’ before it, the term ‘convergence’ is being appropriated by vendors and the press alike to describe very different things. A little disambiguation is in order.
Wikipedia gives us a nice, neutral definition that can rein the conversation back in and remind us what convergence – in this case, storage convergence – is: a consolidation of server, network and storage elements into something that delivers benefits in one or more of the areas of manageability, cost and performance.
“Converged storage is a storage architecture that combines storage and compute into a single entity. This can result in the development of platforms for server centric, storage centric or hybrid workloads where applications and data come together to improve application performance and delivery.” Wikipedia
By this seemingly reasonable definition, storage convergence can be different things to different people – and it is. Enterprise users and vendors tend to focus more on storage convergence approaches that will give them the most cost and administrative benefit. While cost and administrative benefits are important to HPC users too, their focus is more on the performance benefits of lower data access latency associated with having data local to computing resources.
Storage convergence can be achieved through any combination of traditionally separate parts of the I/O infrastructure. This is why DDN developed In-Storage Processing™ technology and started making it available several years ago for customers who wanted to bring data-intensive applications inside the storage – to accelerate data access, minimize latency and significantly lower the cost and complexity of data-intensive computing. At first this was a very custom approach with limited availability, but as demand has grown, In-Storage Processing has been productized in the company’s storage arrays (the SFA family). It is even available with some applications embedded at the factory, such as the Lustre or GPFS parallel file systems, while others can be embedded via professional services. Here, the current favorite is iRODS, an open source data grid solution for managing large sets of computer files, which enables anyone involved in a research project to view, manage, access, add and share data simply and securely across organizations and multi-vendor environments.
If you want to take a look at real-life implementation examples of storage convergence for HPC workflows, check out the University of Florida, the National Energy Research Scientific Computing Center (NERSC) and the International Centre for Radio Astronomy’s Square Kilometer Array (SKA) telescope. In all of these cases a converged storage approach is not just a cost-saving or administrative measure, but a means of achieving a higher return on science – making fundamental discoveries about human health, physical systems and the universe, faster.