Fault Tolerance and High Availability


I/O makes or breaks the system, so storage performance always matters, especially in virtualized environments, where VMs are hungry for IOPS. All-Flash storage is extremely expensive to implement, and all-RAM storage even more so; both are generally considered overkill. A combination of slower spindle, faster Flash and much faster RAM tiers is therefore typically used in the industry.


Hardware failure is more of an issue for a virtualized environment than it was for an all-physical one, because one failing physical machine brings down all the VMs it hosts. Every VM plays the role of an entire server, so such a failure means a catastrophic service interruption. It becomes even more disastrous in the case of VDI and thin clients, because one hypervisor box going down stops a noticeable part of the company’s operations.

It’s essential to build hypervisor clusters to be fault-tolerant and fully redundant. Shared storage is an essential part of virtualization infrastructure, since it stores the VMs of the given virtual environment, so it must never be a single point of failure.
Converged architecture has many drawbacks, including complexity, high cost and high I/O latency


To achieve fault tolerance for the storage subsystem, all critical components are duplicated or even triplicated. In a converged deployment scenario, StarWind runs virtual storage on multiple hypervisor nodes. In a non-converged scenario, storage runs on multiple dedicated commodity servers. The shared Logical Unit is “mirrored” between the hosts, maintaining data integrity and continuous operation even if one or more nodes fail. Every active host acts as a storage controller, and every Logical Unit has a duplicated or triplicated data back-end.
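The mirroring idea can be illustrated with a minimal sketch. This is not StarWind's implementation: the `Node` and `MirroredLU` classes and their methods are hypothetical names invented for illustration, and the sketch deliberately leaves out resynchronization of a node that returns after a failure.

```python
class Node:
    """Hypothetical storage node holding one replica of a Logical Unit."""

    def __init__(self, name):
        self.name = name
        self.online = True
        self.blocks = {}  # block address -> data

    def write(self, lba, data):
        if not self.online:
            raise ConnectionError(f"{self.name} is offline")
        self.blocks[lba] = data

    def read(self, lba):
        if not self.online:
            raise ConnectionError(f"{self.name} is offline")
        return self.blocks[lba]


class MirroredLU:
    """Hypothetical Logical Unit mirrored across several nodes.

    Assumption: a node that goes offline stays offline; bringing it back
    would require a resync step that this sketch does not model.
    """

    def __init__(self, nodes):
        self.nodes = list(nodes)

    def write(self, lba, data):
        # Send the write to every replica; count the ones that accepted it.
        acked = 0
        for node in self.nodes:
            try:
                node.write(lba, data)
                acked += 1
            except ConnectionError:
                continue  # skip the failed node, keep serving I/O
        if acked == 0:
            raise OSError("no replica available for write")
        return acked

    def read(self, lba):
        # Any surviving replica can serve the read.
        for node in self.nodes:
            try:
                return node.read(lba)
            except ConnectionError:
                continue
        raise OSError("no replica available for read")


if __name__ == "__main__":
    lu = MirroredLU([Node("host-a"), Node("host-b"), Node("host-c")])
    lu.write(0, b"vm-disk-block")
    lu.nodes[0].online = False      # simulate one host failing
    print(lu.read(0))               # still served by a surviving host
```

As long as at least one replica remains reachable, reads and writes keep succeeding, which is the property the duplicated or triplicated back-end is meant to provide.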

The multipath design ensures that even if some I/O requests fail, work continues immediately with zero downtime. This way, 99.99% uptime is achieved with a 2-way replica and 99.9999% with a 3-way replica. Going beyond triplication is considered pointless in most cases, unless it’s a life-critical system such as nuclear reactor control or cruise missile guidance.
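The text does not say how these uptime figures were derived. One simple model that reproduces them is sketched below. It assumes that each node is independently available 99% of the time and that storage stays up as long as at least one replica is up. Both the 99% figure and the independence of failures are assumptions made for illustration.

```python
def replicated_availability(node_availability: float, replicas: int) -> float:
    """Probability that at least one of `replicas` independent nodes is up."""
    if not 0.0 <= node_availability <= 1.0:
        raise ValueError("node_availability must be between 0 and 1")
    if replicas < 1:
        raise ValueError("replicas must be at least 1")
    # All replicas must be down at once for the storage to be unavailable.
    return 1.0 - (1.0 - node_availability) ** replicas


def downtime_minutes_per_year(availability: float) -> float:
    """Expected downtime in minutes over a 365-day year."""
    return (1.0 - availability) * 365 * 24 * 60


if __name__ == "__main__":
    NODE_AVAILABILITY = 0.99  # assumption: per-node uptime, not from the source
    for replicas in (1, 2, 3):
        a = replicated_availability(NODE_AVAILABILITY, replicas)
        print(f"{replicas}-way: {a:.4%} uptime, "
              f"{downtime_minutes_per_year(a):.2f} min/year downtime")
```

Under these assumptions, 99.99% uptime corresponds to roughly 53 minutes of downtime per year and 99.9999% to roughly half a minute. That steep drop suggests why a fourth replica rarely pays off.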
Hyperconverged architecture optimizes hardware usage and accelerates performance


StarWind Virtual SAN eliminates the single point of failure for storage in virtualized infrastructure by duplicating or triplicating data, caches and I/O controllers, “mirroring” them all between independent, physically separate hosts. This way, the virtual shared storage becomes fault-tolerant and provides high availability together with higher performance at low cost.
