My colleague Michael Letschin @mletschin wrote this post for our corporate blog, and asked me to provide a bit of color and a bit of editing. I was honored to have even a small part in this educated perspective.
The latest storage trends rely on flash or solid state storage, which seems like a great way to speed up applications, speed up virtual machines, and generally make your systems run faster; but what happens when that same fast SSD fails? Jon Toigo does a good job explaining SSD failures here – http://goo.gl/zDXd2T. The failure rate of SSDs is a real concern because of the cell technology involved. Manufacturers have looked for ways to address it: wear leveling, cell care, even over-provisioning capacity that is never advertised so that new writes always have fresh cells to land in while the old blocks are erased in the background. In every case, though, you are depending entirely on the drive manufacturer to save your disk.
Now the important question: did you get a choice, when you bought your enterprise storage, as to which manufacturer’s SSDs became your “fast” drives? Unlikely, and without that choice you can’t know whether your drives will be the hare that never slows down and wins the race, or the one that stops by the side of the road and is easily overtaken by the tortoise.
This is a situation that a ZFS based storage platform like Nexenta can not only help solve, but one where it can tell you exactly what you have and how to manage the life span and speed of your enterprise class storage. Nexenta is built on the ZFS file system and uses commodity drives, so the first problem, not knowing what drive you have, is solved instantly: you can use best of breed SSD or flash, and replace it as newer technology arrives.
The real secret sauce comes into play when you combine best in class SSD protection with a file system built to optimize how the SSD is used, leaning on DRAM as much as possible and isolating the reads and writes of normal usage from each other. ZFS stores all data in a hybrid storage pool, and it inherently separates the read cache from the write cache, each on its own SSD, so each device can be selected specifically for its use case. SSD wear is mostly a concern for write operations, and in ZFS those are handled by the ZIL, the ZFS Intent Log.
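To make that separation concrete, here is a minimal Python sketch of the idea. It is purely an illustration of the data flow, not ZFS internals, and every name in it is my own simplification: reads walk down from DRAM to the read SSD to spinning disk, while a synchronous write is staged in DRAM, logged to the write SSD, and acknowledged before it ever reaches the capacity tier.

```python
class HybridPoolSketch:
    """Toy model of the separated read and write paths; not ZFS internals."""

    def __init__(self):
        self.arc = {}     # DRAM read cache (ARC)
        self.l2arc = {}   # read-optimized SSD (L2ARC)
        self.zil = []     # write-optimized SSD (ZIL): a log, replayed only after a crash
        self.disk = {}    # spinning-disk capacity tier
        self.dirty = {}   # data staged in DRAM, not yet flushed to disk

    def read(self, block):
        # Reads never touch the write SSD: ARC first, then L2ARC, then disk.
        for tier in (self.arc, self.l2arc, self.disk):
            if block in tier:
                self.arc[block] = tier[block]   # keep hot data in DRAM
                return tier[block]
        raise KeyError(block)

    def write_sync(self, block, data):
        self.arc[block] = data            # staged in DRAM first
        self.dirty[block] = data          # remembered as not-yet-on-disk
        self.zil.append((block, data))    # and logged to the write SSD
        return "ack"                      # the client is acknowledged at this point

    def flush(self):
        # Later, asynchronously, dirty data moves from DRAM to the capacity tier
        # and the corresponding log entries are retired.
        self.disk.update(self.dirty)
        self.dirty.clear()
        self.zil.clear()
```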
For ZIL devices it is recommended to use SLC (Single Level Cell) drives or RAM-based SSDs like ZeusRAM, since SLC cells have a much lower wear rate. For an analysis of the different types of SSD, look here – http://goo.gl/vE87s. Only synchronous writes go to the ZIL, and only after they have first been written to the ARC (Adaptive Replacement Cache), the server’s DRAM. Once the data blocks are committed to the ZIL a response is sent to the client, and the data is written out to spinning disk asynchronously.

Writes from the client are not the only writes an SSD sees. In any tiered storage design, blocks must be written into the read cache before they can be read from it. That is true of ZFS hybrid storage pools as well; the differentiator is how often blocks are written to the L2ARC, the Level 2 Adaptive Replacement Cache. The L2ARC normally lives on MLC or eMLC SSDs and is the second place the system looks, after the ARC in DRAM, for commonly used blocks. Other file systems take a similar approach, but they typically use a least recently used (LRU) algorithm. LRU cannot cope with the case where frequently used blocks are pushed out by one large sequential read, from a backup for instance. The algorithm behind the ARC and L2ARC accounts for this, keeping blocks based on both how recently and how frequently they are used. Specifics are found here – http://goo.gl/tIlZSv.
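To see why that difference matters, here is a small Python comparison between a plain LRU cache and a toy cache that also tracks hit counts. It is in the spirit of the ARC idea rather than the real algorithm, and the workload, cache size, and class names are all invented for the illustration: a hot four-block working set is interrupted by a backup-style scan of a hundred blocks.

```python
from collections import OrderedDict


class LRUCache:
    """Plain least-recently-used cache: one big sequential scan evicts everything."""

    def __init__(self, size):
        self.size = size
        self.entries = OrderedDict()

    def access(self, block):
        hit = block in self.entries
        if hit:
            self.entries.move_to_end(block)
        else:
            if len(self.entries) >= self.size:
                self.entries.popitem(last=False)   # evict the least recently used block
            self.entries[block] = True
        return hit


class FrequencyAwareCache:
    """Toy recency-plus-frequency policy, in the spirit of ARC but not the real
    algorithm: blocks seen only once are evicted before blocks hit repeatedly."""

    def __init__(self, size):
        self.size = size
        self.entries = OrderedDict()   # block -> hit count, ordered by recency

    def access(self, block):
        hit = block in self.entries
        if hit:
            self.entries[block] += 1
            self.entries.move_to_end(block)
        else:
            if len(self.entries) >= self.size:
                once = [b for b, n in self.entries.items() if n == 1]
                victim = once[0] if once else next(iter(self.entries))
                del self.entries[victim]
            self.entries[block] = 1
        return hit


# A hot four-block working set, interrupted by a 100-block backup-style scan.
workload = [0, 1, 2, 3] * 5 + list(range(100, 200)) + [0, 1, 2, 3] * 5

for cache in (LRUCache(8), FrequencyAwareCache(8)):
    hits = sum(cache.access(b) for b in workload)
    print(type(cache).__name__, "hits:", hits)
```

Running it, the frequency-aware cache is still serving the hot blocks after the scan, while plain LRU has evicted them and must fetch every one again.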
The way data moves to and from SSD with the ZIL and L2ARC matters not just for SSD wear but also for power consumption, which is paramount in the datacenter of the future. This approach lets you build systems with an all-SSD footprint and minimal power draw, or with slower, larger-capacity drives behind the caches, while still maintaining a high level of performance.
In many ways, the tortoise and hare analogy plays well with what we’ve been discussing. Leveraging the power of SSD, and the proper SSDs at that, gives you the sheer speed and lean nature of the hare while employing the economy and efficiency of the tortoise. That, in a way, is the nature of ZFS: power and economy wrapped up in one neat package. The real magic is the ability to tune the system up or down until it performs just the way you’d like, simply by adding or removing SSDs in either the ZIL or L2ARC role, as in the sketch below.
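On a Nexenta or other illumos-style system, that tuning boils down to a couple of zpool commands. The sketch below simply wraps them in Python so they can be scripted; the pool name and device paths are placeholders, so substitute your own before trying anything like this.

```python
import subprocess

POOL = "tank"  # hypothetical pool name; use your own pool and device paths


def zpool(*args):
    """Print and run a zpool command."""
    cmd = ["zpool", *args]
    print(" ".join(cmd))
    subprocess.run(cmd, check=True)


# Add a mirrored pair of SLC or RAM-based SSDs as a dedicated log device (ZIL):
zpool("add", POOL, "log", "mirror", "c1t0d0", "c1t1d0")

# Add an MLC/eMLC SSD as L2ARC read cache:
zpool("add", POOL, "cache", "c1t2d0")

# Log and cache devices can also be taken back out of the mix later:
zpool("remove", POOL, "c1t2d0")
```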
There’s nothing inherently wrong with being a tortoise, but being a hare that is well designed, performing at peak efficiency, and enduring for the entire race really seems like the best way to go, doesn’t it?