Hyper-V over SMB 3.0
The traditional method of storing a virtual machine is to create a logical unit (a LUN) in some physical storage system, such as a RAID array in the host or in a SAN, create an NTFS (Hyper-V) or VMFS (VMware) volume in that LUN, and place the virtual machine’s files in that volume. If the LUN is created in a SAN, then the host requires a connection, typically provided by iSCSI or Fibre Channel. The key benefit of using shared storage, such as a SAN, is that virtual machine files are abstracted from the hosts; virtual machines execute on the hosts but are stored on a different tier of storage. Virtual machines are mobile and can be moved using Live Migration (planned) or failover (reactive).
With the release of Windows Server 2012, Microsoft upgraded its Server Message Block (SMB) file sharing protocol to support storing virtual machines on a shared folder on a file server (the host and file server must both be running Windows Server 2012 or later). This was unsupported and unthinkable prior to the release of SMB 3.0. SMB 3.0 added two features that enabled high throughput and low latency connectivity between Hyper-V hosts and virtual machine storage on a file server:
- SMB Multichannel: When a host is connecting to a shared folder on a file server, there is a discovery of mutually capable features. Part of this discovery process is to decide which connection(s) will be used between the host and the file server to access virtual machine files. If the host and file server share multiple common connections (1 Gbps or faster), then SMB Multichannel will automatically aggregate the bandwidth with fault tolerance – one can think of this as auto-configured multipath IO (MPIO). If SMB 3.0 discovers a higher capacity NIC (10 Gbps or faster) with Receive Side Scaling (RSS) enabled, then it can use the full bandwidth of that NIC and can even spread this load over multiple similar connections.
- SMB Direct: SMB 3.0 can offload processing of data flow to NICs that offer support for Remote Direct Memory Access (RDMA, offered in iWARP, RoCE, and InfiniBand hardware). SMB Direct reduces latency and improves the performance of hosts and file servers.
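The aggregation-with-fault-tolerance idea behind SMB Multichannel can be illustrated with a toy model: stripe a payload across several links and reassemble it, then restripe over the survivors when a link fails. This is a conceptual sketch only; the channel names and chunk size are hypothetical and do not reflect the actual SMB 3.0 wire protocol.

```python
# Toy model of bandwidth aggregation with fault tolerance, in the
# spirit of SMB Multichannel. Chunks are assigned round-robin to the
# available channels; if a NIC fails, traffic is simply restriped
# over the channels that remain.

def stripe(payload: bytes, channels: list, chunk_size: int = 4) -> dict:
    """Assign sequential chunks of the payload round-robin to channels."""
    chunks = [payload[i:i + chunk_size] for i in range(0, len(payload), chunk_size)]
    plan = {ch: [] for ch in channels}
    for i, chunk in enumerate(chunks):
        plan[channels[i % len(channels)]].append((i, chunk))
    return plan

def reassemble(plan: dict) -> bytes:
    """Merge the per-channel (index, chunk) lists back into the payload."""
    indexed = sorted(pair for pairs in plan.values() for pair in pairs)
    return b"".join(chunk for _, chunk in indexed)

data = b"virtual machine disk traffic"
plan = stripe(data, ["nic1", "nic2"])   # two links, hypothetical names
assert reassemble(plan) == data         # aggregation preserves the data

plan = stripe(data, ["nic2"])           # simulate losing nic1
assert reassemble(plan) == data         # traffic survives on one link
```

The fault-tolerance property here is what makes the "auto-configured MPIO" comparison apt: the reassembly logic is indifferent to how many channels carried the chunks.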
SMB 3.0 was improved in Windows Server 2012 R2 and will see further improvements in the next version of Windows Server, scheduled for release in H2 of 2015. SMB is Microsoft’s strategic protocol for data transmission in the data center, and we have seen Microsoft adopt it for other purposes such as high speed Hyper-V Live Migration and storage-based replication.
Scale-Out File Server
The Scale-Out File Server, or SOFS, is a relatively new architecture that provides software-defined, transparent failover (fault tolerant) and scalable storage. A SOFS is a cluster of file servers that share some common storage. The SOFS appears on the network as a single file server that shares the physical storage as file shares with Hyper-V hosts over the SMB 3.0 protocol.
A SOFS is built on off-the-shelf hardware. Benefits of this approach are:
- Lower costs: Businesses of all kinds struggle with the cost of storage. SOFS offers you a choice of individual components, and this allows customers to build lower cost shared storage solutions without the constraints of vendor lock-in.
- More choice: Vendor lock-in restricts choice, and this can preclude some businesses from adopting best of breed components. A software-defined storage solution opens up a world of opportunity, letting the architect pick from a wider variety of components (servers, NICs, HDDs, PCIe flash, SSDs, and so on) to build the storage platform.
- Problem avoidance: An unfortunate side-effect of vendor lock-in is that many customers have had problems when component drivers/firmware have not been maintained sufficiently; those customers have no choice but to live with the problem. A customer with an open hardware design has choice and can pick components from more responsible vendors.
The low cost per terabyte has made SOFS an appealing alternative to legacy SANs for organizations of all kinds including businesses with smaller budgets, organizations that struggle with data growth or virtual machine sprawl, and hosting companies that must be price competitive.
StarWind Virtual SAN
StarWind Software is an award-winning company that has been providing software-defined storage solutions since 2003, long before the term “software defined” became trendy in the IT industry. Its flagship product is StarWind Virtual SAN, which enables a business to deploy a number of physical servers and present their internal disks as a unified block of replicated storage to Windows, Linux, UNIX, vSphere, or Hyper-V servers.
Possible scenarios based on StarWind Virtual SAN include:
- Shared iSCSI storage for physical servers
- Hyper-converged storage for Hyper-V hosts
- A tier of shared iSCSI storage for Hyper-V hosts
- A SOFS for Hyper-V hosts
The goal of StarWind Virtual SAN is to make high-end storage features available in a fault-tolerant and scalable software-based solution that is affordable for small, medium, and large enterprises. These innovative features improve performance and availability, and reduce the costs of Hyper-V storage.
Log-Structured File System
In virtualization, there are normally many virtual machine files stored on a single LUN or volume. The applications in each virtual machine generate a lot of small, parallel, random writes – this is the kind of storage activity that storage systems struggle to deal with. StarWind uses a combination of RAM and flash storage as a multi-layered cache to coalesce many smaller writes into one larger write. This log structuring process in the Log-Structured File System can improve raw sequential write performance by 90%, many times more than competing solutions can offer.
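The coalescing idea can be sketched in a few lines: small random writes are buffered, then flushed as one large sequential append to a log, with an index recording where each logical block now lives. This is a minimal illustrative model, not StarWind’s implementation; the class name and flush threshold are invented for the example.

```python
# Sketch of write coalescing in a log-structured design: many small
# random writes become one large sequential append, and an index maps
# each logical block to its latest location in the log.

class LogStructuredStore:
    def __init__(self, flush_threshold: int = 4):
        self.log = bytearray()          # sequential log (the "disk")
        self.index = {}                 # logical block -> (offset, length)
        self.buffer = []                # pending small writes (the cache layer)
        self.flush_threshold = flush_threshold
        self.flushes = 0                # how many large writes hit the log

    def write(self, block: int, data: bytes):
        self.buffer.append((block, data))
        if len(self.buffer) >= self.flush_threshold:
            self.flush()

    def flush(self):
        # One large sequential write instead of many small random ones.
        for block, data in self.buffer:
            self.index[block] = (len(self.log), len(data))
            self.log += data
        self.buffer = []
        self.flushes += 1

    def read(self, block: int) -> bytes:
        offset, length = self.index[block]
        return bytes(self.log[offset:offset + length])

store = LogStructuredStore()
for blk in (7, 2, 9, 4):                # random-looking small writes
    store.write(blk, f"blk{blk}".encode())
assert store.flushes == 1               # four small writes became one append
assert store.read(9) == b"blk9"
```

The payoff is that the backing disks only ever see large sequential writes, the access pattern rotational and flash media both handle best.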
StarWind Virtual SAN can combine three tiers of storage to cost-effectively improve the performance of storage:
RAM is used as a level 1 cache and write buffer; this offers the best possible performance for reads and writes. In-line deduplication and compression also reduce storage space utilization. These combined systems protect level 2 (PCIe or SAS) flash storage from huge amounts of write activity, and this extends the life of costly flash storage. Using RAM and flash for the hot working set data allows colder data to sit on affordable and lower speed/cost HDD storage, thus enabling a business to get the right balance between cost and performance.
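The effect of a level 1 RAM cache on a slower backing tier can be shown with a simple model: a small LRU cache absorbs the hot working set, so most reads never touch the slower tier (standing in here for flash or HDD). The class and slot count are hypothetical, chosen only to make the behavior visible.

```python
# Sketch of tiered reads: a small RAM cache (level 1) in front of a
# slower backing tier. Hot blocks are served from RAM; the slow tier
# is touched only on a miss, and the least recently used block is
# evicted when RAM fills up.

from collections import OrderedDict

class TieredReader:
    def __init__(self, backing: dict, ram_slots: int):
        self.backing = backing              # slower tier (flash/HDD stand-in)
        self.ram = OrderedDict()            # level 1 cache, LRU order
        self.ram_slots = ram_slots
        self.backing_reads = 0              # count of slow-tier accesses

    def read(self, key):
        if key in self.ram:                 # hot data: served from RAM
            self.ram.move_to_end(key)
            return self.ram[key]
        self.backing_reads += 1             # cold data: hit the slow tier
        value = self.backing[key]
        self.ram[key] = value
        if len(self.ram) > self.ram_slots:
            self.ram.popitem(last=False)    # evict least recently used
        return value

disk = {n: f"block{n}" for n in range(100)}
cache = TieredReader(disk, ram_slots=2)
for _ in range(10):                         # hot working set of two blocks
    cache.read(1)
    cache.read(2)
assert cache.backing_reads == 2             # slow tier touched only twice
```

The same mechanism, applied to writes, is what shields the flash tier from churn: repeated updates to a hot block are absorbed in RAM rather than worn into the flash cells.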
StarWind Virtual SAN runs on top of Windows, and this abstracts the storage system from the physical hardware. A StarWind customer can choose any server vendor and use any hardware components that are compatible with their servers. This allows customers to deploy best of breed or more economical solutions.
In-Line Deduplication and Compression
Two techniques are used in combination to reduce space utilization of the shared storage and lengthen the life of expensive flash storage, both of which will minimize the costs of storage. In-line deduplication, based on 4K blocks, reduces the amount of data that is sent to the disks in a virtual SAN. Compression reduces the amount of space consumed by data that is actually stored on the disks (flash and HDD).
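The interplay of the two techniques can be sketched as follows: each 4K block is hashed before it reaches the disks, only blocks with previously unseen content are stored, and those that are stored are compressed first. This is an illustrative model with invented helper names, not StarWind’s actual data path; SHA-256 and zlib stand in for whatever fingerprinting and compression the product uses.

```python
# Sketch of in-line deduplication on 4K blocks with compression:
# duplicate blocks are dropped before they reach the disks, and the
# unique blocks that remain are stored compressed.

import hashlib
import zlib

BLOCK = 4096  # deduplication granularity, per the 4K-block scheme

def dedup_write(data: bytes, store: dict) -> int:
    """Split data into 4K blocks; store each unique block compressed.
    Returns how many new blocks actually reached the store."""
    new_blocks = 0
    for i in range(0, len(data), BLOCK):
        block = data[i:i + BLOCK]
        digest = hashlib.sha256(block).hexdigest()
        if digest not in store:             # duplicate blocks cost nothing
            store[digest] = zlib.compress(block)
            new_blocks += 1
    return new_blocks

store = {}
vm_disk = bytes(BLOCK) * 8                  # eight identical zeroed blocks
assert dedup_write(vm_disk, store) == 1     # only one unique block is stored
assert dedup_write(vm_disk, store) == 0     # a rewrite stores nothing new
```

Because the hash check happens in-line, before the write lands, duplicate data never generates flash wear at all – which is why the two techniques together both save capacity and extend flash life.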
StarWind Virtual SAN will dramatically reduce the TCO of any Hyper-V deployment because it doesn’t require the dedicated storage hardware that is typically required by, for example, Clustered Storage Spaces. This means no iSCSI, FC, NAS, or SAN hardware. StarWind Virtual SAN starts with just two hosts already running Microsoft Windows or Hyper-V and literally no other hardware.
- Savings in OpEx and CapEx. StarWind Virtual SAN starts with a minimal configuration of two nodes – fewer than other solutions require. It doesn’t need SAS JBODs, wiring, or switches with stellar prices, utilizing inexpensive SATA and Ethernet instead.
- Uncompromised Performance. StarWind Virtual SAN uses sophisticated proprietary algorithms, such as log structuring and in-line deduplication, to keep performance at the top.
- Native to Windows. StarWind Virtual SAN is a native Windows application, which means that any system administrator can configure and run it in less than five minutes without special training. It integrates seamlessly with management tools such as SCVMM (via SMI-S) and PowerShell, and supports SMB 3.0, iSCSI, and SOFS, among other standards.
Hyper-V over SMB 3.0 with StarWind Virtual SAN
The term “Hyper-V over SMB 3.0” refers to a deployment scenario where Hyper-V hosts access virtual machines that are stored on a shared folder on a file server of some kind. The best practice is that this is a continuously available shared folder on a Scale-Out File Server or SOFS. A SOFS is a Windows Server Failover Cluster made up of between two and eight file servers with some common, cluster-supported storage.
The physical storage might be:
- SAS-attached JBOD (just a bunch of disks) trays
- PCI RAID
- Fibre Channel, FCoE, iSCSI, or SAS-attached SAN
One could deploy a Scale-Out File Server with a number of shared JBOD trays, as depicted below. In this illustration, two file servers are clustered and run a special Windows Server Failover Clustering role called the Scale-Out File Server. The servers are connected to shared JBOD trays using SAS cables; it is these trays that the file shares (and hence the virtual machine files) will be stored on. Many Hyper-V hosts can connect to the Scale-Out File Server to access the shared folders that reside on the JBOD trays.
Alternatively, one can deploy a SOFS using StarWind Virtual SAN, as shown in the following diagram.

[Figure: A SOFS made using StarWind Virtual SAN]

SOFS further leverages the flexibility of software-defined storage by reducing the reliance on another tier of physical hardware. This solution reduces the cost of the SOFS by removing the entire SAS layer (host bus adapters, cables, and JBODs); instead, StarWind aggregates the internal storage of the SOFS cluster nodes to create shared cluster storage. The shared folders of the SOFS (and hence the virtual machine files) are stored on the disks in the servers.
The functionality of SOFS is also enhanced with the features of StarWind Virtual SAN.
This means that:
- Small write operations are coalesced into more efficient larger writes
- RAM can be used as high speed level 1 cache
- In-line deduplication will reduce writes to the necessary minimum
- Low-cost SSDs can be used without fear of burnout
- Compression will reduce the amount of physical capacity that is used