npools

Tom Haynes wrote a great article about pNFS: "How does a npool compare to a zpool?". In it he plays with the concept of npools as storage pools made up of several servers, as opposed to zpools, which are storage pools made up of the disks in a single server. A really interesting idea.

The article led me to another thought: technology in IT moves in waves and cycles. At first there were file-level storage networks with the first versions of NFS, then came block-level SANs like FC. Within that cycle there were waves … although there were always predominant technologies. There was a time when everybody got laughed at for the idea of using NFS (by and large a problem introduced by the sub-standard quality of a certain open-source implementation of NFS), and everybody wanted block devices for everything. Now the problem is that block devices often outnumber the servers by an order of magnitude, which leads to management problems like fragmented storage. This led to additional complexity in the form of thin provisioning, which does not work well with the space allocation patterns of the data. And the whole idea of virtualisation in storage is just an added layer of complexity to solve another complexity that doesn't go away; it's just a skeleton hidden in the closet.

Perhaps pNFS is the next wave of file-level storage in the datacenter. There are several good reasons for that. With virtualisation of desktops and servers there is no longer a physical limit preventing people from creating a small server for each and every task. When you try to do this, the number of block devices simply gets outrageous. In discussions with customers I see a trend towards providing storage to virtualisation layers in the form of files in a shared filesystem served via NFS, even when there is a performance penalty. Simply because it's easier to manage.

The concept of mirrored stripes in pNFS may even render the whole concept of making the storage itself highly available superfluous, as pNFS in conjunction with the metadata server may solve many of the problems. Of course this puts a big load on the storage servers. But with pNFS you have really interesting possibilities to spread the load generated by the virtual disks over smaller systems: like building stripes out of whole servers, or like having real shared access instead of having to fiddle around with many configurations while moving a LUN from one server to another.

This shows another basic concept of IT: any given problem in HPC will appear in commercial IT not much later. Any given technology in HPC will appear in commercial IT not much later. pNFS and similar developments in cluster-based data storage were created for storing large amounts of data for HPC. But virtualisation, for example, shares many of its problems with large compute clusters. Thus anybody in commercial IT should have a close look at HPC. Solutions and problems found in that area will help or hit them two or three years later. Better to be prepared and informed. pNFS is definitely a technology from HPC that will help you solve your commercial IT problems in the future.
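
To make the idea of "mirrored stripes of whole servers" a bit more concrete, here is a minimal sketch in Python. It is only a toy model and not the real pNFS layout format: the class `MirroredStripeLayout`, the server names `ds1` to `ds4` and the 1 MiB stripe unit are all made up for illustration. It merely shows how a byte offset in a virtual disk file could map to a group of servers that mirror each other, and how a read could fall back to a surviving mirror.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MirroredStripeLayout:
    """Toy layout (hypothetical, not the pNFS wire format):
    a file is striped across mirror groups, each group being
    a tuple of servers that hold identical data."""
    mirror_groups: tuple  # e.g. (("ds1", "ds2"), ("ds3", "ds4"))
    stripe_unit: int      # bytes per stripe unit

    def servers_for_offset(self, offset):
        """Return the mirror group holding the byte at `offset`."""
        if offset < 0:
            raise ValueError("offset must be non-negative")
        unit_index = offset // self.stripe_unit
        # Stripe units are dealt out to the mirror groups round-robin.
        return self.mirror_groups[unit_index % len(self.mirror_groups)]

    def readable_server(self, offset, failed=frozenset()):
        """Pick the first mirror for `offset` that is not marked failed."""
        for server in self.servers_for_offset(offset):
            if server not in failed:
                return server
        raise RuntimeError("all mirrors for this stripe unit are unavailable")


# Assumed example setup: two mirror pairs, 1 MiB stripe unit.
layout = MirroredStripeLayout(
    mirror_groups=(("ds1", "ds2"), ("ds3", "ds4")),
    stripe_unit=1024 * 1024,
)
```

With this setup, consecutive 1 MiB chunks of a virtual disk alternate between the two server pairs, so the load is spread over all four machines, and losing `ds1` alone still leaves every chunk readable from `ds2`. In a real deployment the client would not compute this by itself in such a simple way; it would get its layout from the metadata server.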