I/O makes or breaks a system, so in IT infrastructure planning it is imperative to balance the throughput of all components. Overall system performance is always approximately equal to the performance of its slowest part. Thus, even one slow component will drag down an otherwise very fast system built from high-end hardware. Such a component is called a “bottleneck”: a “narrow end” that slows down the whole process. Conversely, one ultra-performing component will not make a difference in an otherwise mediocre system. That is why balancing the throughput of components is the key to a cost-efficient setup.
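The “slowest part” rule can be illustrated with a minimal sketch. It assumes a simplified model in which every request passes through each stage in turn, so sustained throughput is capped by the slowest stage. The stage names and MB/s figures below are hypothetical illustration values, not measurements of any real hardware.

```python
# Simplified model (an assumption): data passes through every stage in series,
# so sustained throughput cannot exceed that of the slowest stage.

def end_to_end_throughput(stages):
    """Return the sustained throughput (MB/s) of a serial chain of stages."""
    return min(stages.values())


def find_bottleneck(stages):
    """Return the name of the stage with the lowest throughput."""
    return min(stages, key=stages.get)


# Hypothetical throughput figures in MB/s, chosen only for illustration.
stages = {"dram_cache": 20000, "flash_storage": 6000, "interconnect": 1200}

print(find_bottleneck(stages), end_to_end_throughput(stages))

# Doubling the speed of a non-bottleneck stage changes nothing end to end.
upgraded = dict(stages, flash_storage=12000)
print(end_to_end_throughput(upgraded))
```

In this model, money spent on the faster flash stage buys no extra throughput until the interconnect itself is addressed, which is the cost-efficiency argument made above.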
Interconnect fabrics are typically much slower than high-speed data storage such as flash and, especially, DRAM. This makes them the “bottleneck” of a system that otherwise has great internal data processing capabilities: no matter how fast the local I/O is, the data still has to go through the slower wire. The issue negates the benefits of fast storage, such as ultra-performing all-flash setups, and greatly reduces the cost-efficiency of the IT infrastructure. The maximum I/O speed is out of reach whenever data has to cross the fabric.
I/O requests have to go through the wire when application data is scattered and not stored locally
Processing as much of the I/O as possible locally avoids the slow data transfer through the interconnect fabric and achieves the highest performance the setup can provide. This approach is called “data locality”, and it is utilized in StarWind appliances to maximize performance. The idea is to keep the compute and storage resources for every virtual machine on the same physical node. This way, VM data rarely has to go through the slow wire, and performance increases.
Local I/O is much faster, so performance and latency are better in this case
“Data locality” effectively solves the problem of slower interconnect fabrics being the “bottleneck” of the IT infrastructure. It keeps most of the I/O for each process local, within the boundaries of its physical node, and ensures much better performance than typical multi-node configurations, where compute and storage resources for one process may be located on different servers. Additionally, “data locality” provides lower latency, because I/O served locally avoids network stack processing overhead and much less data has to be transferred over the network.
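The latency effect can be approximated with a small sketch. It assumes a simple weighted-average model in which each request is served either locally or over the fabric. The function name and the microsecond figures are hypothetical and do not describe StarWind's implementation or any measured system.

```python
# Weighted-average latency model (an assumption for illustration only):
# each request is either served on the local node or fetched over the fabric.

def average_latency_us(local_fraction, local_latency_us, remote_latency_us):
    """Return the mean I/O latency for a given share of locally served requests."""
    if not 0.0 <= local_fraction <= 1.0:
        raise ValueError("local_fraction must be between 0 and 1")
    remote_fraction = 1.0 - local_fraction
    return local_fraction * local_latency_us + remote_fraction * remote_latency_us


# Hypothetical latencies: 100 us for local I/O, 500 us across the interconnect.
LOCAL_US = 100.0
REMOTE_US = 500.0

# Scattered data: only a quarter of requests are served locally.
scattered = average_latency_us(0.25, LOCAL_US, REMOTE_US)

# Data locality: three quarters of requests stay on the VM's own node.
localized = average_latency_us(0.75, LOCAL_US, REMOTE_US)

print(scattered, localized)
```

Under these assumed numbers, raising the local share from 25% to 75% halves the average latency, which shows why keeping a VM's compute and storage on the same node pays off even when some traffic still crosses the wire.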