Both the GreyBeards on Storage and the Storage Unpacked podcasts have interviewed Weka.IO in the past, and I listened attentively. I had a customer looking to replace their existing infrastructure, which was based purely on open-source components; the goal was something just as fast but far more supportable. Between those two podcasts and my own research, I was very happy to recommend Weka.IO to my customer.
Parallel file systems are complex, with a number of critical requirements: resiliency, scalability, and ease of deployment among them. I'm happy to say that Weka.IO meets all of them with aplomb.
At #SFD18 (Storage Field Day 18) we were treated to a presentation by these folks on their new approach to a fully optimized, well-architected parallel file system. Needless to say, I was hopeful that what I saw would be as well architected and resilient as I had been led to believe. The truth is, I came away even more impressed than I expected.
Unlike previous systems in this vein, the architecture gives the customer real flexibility in how it is deployed: either converged alongside the customer's own applications, or as a dedicated appliance. Built on a "Reference Architecture" using commodity off-the-shelf (COTS) hardware, customers can buy a pre-built system or build to specification on their already-defined server architecture.
It’s a fully tiered model, with a flash tier and a spinning-disk tier, in a massively scale-out design under a single namespace. It supports object storage, replication, and snapshotting to cloud locales for DR. When rehydration is required, that process actually recreates the file system as well.
One of the key details of this architecture is that performance is enhanced by evenly distributing the metadata across all nodes, making it a strong fit for seismic, genomics, financial trading, petrochemical, biomedical, and even visual-effects workloads.
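To give a feel for the general idea (a minimal sketch of hash-based metadata placement in general, not Weka.IO's actual implementation), spreading metadata ownership across every node by hashing keys means no single metadata server becomes a hotspot:

```python
import hashlib

# Hypothetical sketch: hash-based placement of metadata entries across
# cluster nodes, so no single node owns a disproportionate share.
NODES = [f"node-{i}" for i in range(16)]  # a sixteen-node cluster, as in the demo

def metadata_owner(path: str) -> str:
    """Map a file path to the node responsible for its metadata."""
    digest = hashlib.sha256(path.encode()).digest()
    index = int.from_bytes(digest[:8], "big") % len(NODES)
    return NODES[index]

print(metadata_owner("/genomics/sample-001.bam"))  # deterministic, evenly spread
```

Because every node serves a roughly equal slice of metadata lookups, metadata traffic scales with the cluster rather than bottlenecking on a dedicated metadata server.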
Interestingly, the operating system and file system are so lean that jobs often complete before the data is even written to the disk itself. And the resiliency is just as strong: in a demo we saw, one node of a sixteen-node cluster was downed in a simulated abend (abnormal end) while performance was being tracked, and the job transaction still completed in less time than anticipated. Abending a second node showed a degradation of redundancy, but performance (read/write throughput) lost nothing.
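For context on why a two-node loss degrades redundancy without hurting I/O, here's the back-of-the-envelope arithmetic for a generic distributed erasure-coding layout (illustrative only; the k and m values are my assumption, not Weka.IO's published scheme): a stripe of k data chunks plus m parity chunks stays readable through up to m simultaneous failures.

```python
# Generic erasure-coding arithmetic (illustrative, not Weka.IO's exact scheme):
# a stripe of k data chunks plus m parity chunks survives up to m failures.
def stripe_survives(k: int, m: int, failed_nodes: int) -> bool:
    """A k+m stripe remains readable while failures do not exceed m."""
    return failed_nodes <= m

k, m = 14, 2              # hypothetical 14+2 layout on a sixteen-node cluster
overhead = (k + m) / k    # capacity overhead vs. raw data: ~1.14x
print(stripe_survives(k, m, failed_nodes=2))  # True: degraded, still serving I/O
print(stripe_survives(k, m, failed_nodes=3))  # False: beyond the protection level
```

In that model, the second abend in the demo consumed the last unit of parity headroom, which is exactly a loss of redundancy rather than a loss of throughput.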
They sell through partnerships, their growth over the past year has been profound, and the future of this impressive POSIX-compliant architecture seems to me to be following an ideal critical path.
Note: I was fortunate to be included in a Storage Field Day event in the Santa Clara area of California, where travel and lodging expenses were paid for. A number of my recent posts have been made possible by attendance at these events, and I hope my participation in the presentations brought the online audience answers to the questions they might have asked if given the chance.