NVMe is one of the hottest topics in the world of storage these days. Expectations for this technology are so high that 2019 is sometimes called the year of NVMe. Nevertheless, PCIe SSDs are still too expensive for SMB and ROBO to build all-NVMe storage infrastructures. To ensure that users fully benefit from this technology with a minimal hardware footprint, StarWind introduces a new protocol for StarWind Virtual SAN: NVMe over Fabrics (NVMe-oF).
Nowadays, as flash becomes increasingly prevalent, adding one or two NVMe drives to a cluster seems like a really good idea, especially if you run IOPS-hungry applications in it. Unfortunately, even with those drives on board, the resulting infrastructure performance will still fall short of satisfactory because PCIe SSDs cannot be presented efficiently over the network. Smart hardware utilization is especially critical for SMB and ROBO, which are often on tight budgets for IT projects (that’s why adding an NVMe drive to the cluster is often a big deal for them). Wondering why most environments just cannot access a good part of PCIe SSD performance over the network? The answer is quite straightforward: legacy protocols.
Traditional protocols like iSCSI, iSER, SMB3, and NFS were designed to talk to slow storage media, not flash. Their single short command queue limits NVMe drive I/O so badly that applications cannot reach a good part of the underlying storage performance. Of course, you’ll still get a performance boost after adding PCIe SSDs to the cluster, but the overall VM performance will be only 20% higher than on spindle drives! Let’s face it, that is a mere fraction of the performance that NVMe SSDs can provide.
Serial Attached SCSI (SAS) – Single short command queue is a performance bottleneck
Are there any alternatives to iSCSI-derived protocols? Yes, there is one tailored to achieve peak NVMe drive performance: NVMe-oF. The single short command queue typical of legacy protocols is replaced with up to 64 thousand command queues, each holding up to 64 thousand commands. This design remarkably reduces latency when NVMe SSDs are presented over the network, making it possible to get all the IOPS that they can provide.
NVMe-oF – Networking is not a performance bottleneck anymore
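To put the queue numbers above in perspective, here is a minimal Python sketch that compares how many commands each model can keep in flight at once. The `QueueModel` class and the `capacity_ratio` helper are hypothetical names made up for this illustration, and the legacy depth of 32 is only an assumed stand-in for a "short" queue; the article does not state an exact figure. The NVMe values use the round "64 thousand" numbers from the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class QueueModel:
    """Hypothetical helper: a protocol's queue layout (illustration only)."""
    name: str
    queues: int  # number of command queues
    depth: int   # commands each queue can hold

    @property
    def max_outstanding(self) -> int:
        # Upper bound on commands that can be in flight at the same time.
        return self.queues * self.depth


# ASSUMPTION: 32 is an illustrative depth for a single short legacy queue.
LEGACY = QueueModel("legacy single queue", queues=1, depth=32)

# Round figures from the text: 64 thousand queues, 64 thousand commands each.
NVME = QueueModel("NVMe multi-queue", queues=64 * 1024, depth=64 * 1024)


def capacity_ratio(a: QueueModel, b: QueueModel) -> int:
    """How many times more commands model a can hold in flight than model b."""
    return a.max_outstanding // b.max_outstanding


if __name__ == "__main__":
    for model in (LEGACY, NVME):
        print(f"{model.name}: {model.max_outstanding:,} outstanding commands")
    print(f"ratio: {capacity_ratio(NVME, LEGACY):,}x")
```

These are protocol ceilings, not what a real drive or host will actually use; the point is only that a single short queue caps parallelism long before a multi-queue design does.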
The problem is that no industry-standard hypervisor features native NVMe-oF support. As a result, admins are locked into legacy protocols that have proven to be inefficient for PCIe SSDs. So, given the current state of IT, another challenge these days is bringing NVMe-oF to all hypervisors. For that very purpose, we added NVMe-oF support to StarWind Virtual SAN.
How successful is StarWind’s implementation of NVMe-oF? Let the numbers talk! We obtained over 2M IOPS on 4 Intel Optane SSD 900P drives in a bare-metal environment, while the latency was only 10 microseconds higher than in Intel’s datasheet. It should also be noted that we changed the way hypervisors talk to NVMe drives. That’s actually why we got slightly more IOPS than Intel claimed in its datasheet. All this being said, StarWind’s NVMe-oF implementation almost eliminates any difference between locally connected PCIe SSDs and ones presented over the network.
When it comes to underlying storage performance, NVMe is the true king of the hill. However, it is still difficult to present PCIe SSDs over the network effectively, since the traditional SCSI-based protocols do not work that well for flash. NVMe-oF is a protocol tailored for flash. StarWind introduces NVMe-oF support to StarWind Virtual SAN so that, from now on, everybody can run applications at full throttle on NVMe drives regardless of the hypervisor.