Virtual machine storage is one of the key components of a virtualized infrastructure, with block and file storage being the most commonly used storage types. Choosing either will impact the performance of the VMs, apps and services you run. That is why it’s particularly important to ensure your VM storage is well-matched to their requirements. So, in this article, we will highlight the difference between block and file storage and describe the most notable protocols.
In file storage, data is stored in files in a hierarchical structure, organized into directories and sub-directories. To retrieve a file from the storage system, you need access permissions and the file path. File protocols enable NAS scenarios where multiple clients connect to a server, allowing for easy file sharing and collaboration.
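To make the path-based access model concrete, here is a minimal Python sketch. It uses a temporary local directory as a stand-in for a mounted NFS or SMB share; the helper names `store_file` and `read_file` and the example path are hypothetical, chosen only for illustration.

```python
import tempfile
from pathlib import Path


def store_file(root: Path, relative_path: str, data: bytes) -> Path:
    """Write data under root, creating parent directories as needed."""
    target = root / relative_path
    target.parent.mkdir(parents=True, exist_ok=True)
    target.write_bytes(data)
    return target


def read_file(root: Path, relative_path: str) -> bytes:
    """Retrieve a file by its path.

    Raises FileNotFoundError if the path is wrong and PermissionError
    if the caller lacks access rights.
    """
    return (root / relative_path).read_bytes()


with tempfile.TemporaryDirectory() as tmp:
    share = Path(tmp)  # assumption: stands in for a mounted file share
    store_file(share, "projects/vm-docs/readme.txt", b"hello")
    content = read_file(share, "projects/vm-docs/readme.txt")
    # Walk the hierarchy: directories and sub-directories, then the file.
    listing = sorted(p.relative_to(share).as_posix() for p in share.rglob("*"))
```

The client only ever deals with paths and permissions; where the bytes physically live is the file server's concern.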
The most popular file (NAS) protocols are NFS (Network File System) and SMB (Server Message Block). Virtualization-wise, NFS-based storage is commonly used with VMware, KVM and Xen, while SMB is the preferred option for Hyper-V environments.
Overall, file storage works best for small to medium-sized unstructured data sets, such as text documents, pictures, media, and other popular content types. However, the latest file storage protocol implementations also work well as virtual machine storage, which is quite useful for admins who don’t want to add block storage (SAN) into the mix.
On the other hand, block storage is commonly used for storing virtual machine files and large structured data sets such as databases. In block storage, data is divided into multiple blocks, and each block has a unique ID. Blocks can be stored on different disk drives or even on different systems connected by a network. When the data is requested, the blocks are retrieved by their IDs and reassembled.
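The split-and-reassemble idea can be sketched in a few lines of Python. This is a toy model, not how any particular SAN works: the "disks" are plain dictionaries, the block size is deliberately tiny, and the names `write_blocks` and `read_blocks` are made up for this example.

```python
from typing import Dict, List, Tuple

BLOCK_SIZE = 4  # assumption: tiny block size so the demo is easy to follow


def write_blocks(
    data: bytes, disks: List[Dict[int, bytes]], block_size: int = BLOCK_SIZE
) -> List[Tuple[int, int]]:
    """Split data into blocks and spread them over the disks round-robin.

    Returns the layout: an ordered list of (disk_index, block_id) pairs.
    """
    layout = []
    for block_id, offset in enumerate(range(0, len(data), block_size)):
        disk_index = block_id % len(disks)
        disks[disk_index][block_id] = data[offset:offset + block_size]
        layout.append((disk_index, block_id))
    return layout


def read_blocks(layout: List[Tuple[int, int]], disks: List[Dict[int, bytes]]) -> bytes:
    """Fetch each block by its ID and reassemble the original data."""
    return b"".join(disks[disk_index][block_id] for disk_index, block_id in layout)


disks = [{}, {}]  # two hypothetical drives
payload = b"virtual-disk-image"
layout = write_blocks(payload, disks)
restored = read_blocks(layout, disks)
```

Note that the blocks themselves carry no file names or directory structure; it is the layout (kept by whatever sits on top of the block device) that gives them meaning.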
Block storage is commonly used with a SAN (Storage Area Network), where the most popular protocols are Fibre Channel, iSCSI, and, more recently, NVMe-oF (NVMe over Fabrics).
Comparing SMB, NFS, iSCSI and NVMe-oF
Now, let us look at the differences and similarities between four popular file- and block-level protocols available in most virtualized environments:
SMB, or Server Message Block, is a file-level storage protocol. It allows the client to read data from and write data to a file server over a network. SMB is known for its simplicity and ease of use. It’s also highly compatible, working well with various operating systems. Windows Server services, such as Microsoft Hyper-V or even Microsoft SQL Server, can use SMB to store their data. SMB 3.0, available since Windows Server 2012, came with new features such as SMB Multichannel and SMB Direct that significantly enhanced the protocol's performance and expanded its use cases.
Performance-wise, SMB is tuned for large sequential reads/writes, which is what “casual” users do most of the time. It is also great for storing unstructured data and even virtual machines in Windows-based environments with native SMB 3.0 and its advanced features. That is why small businesses prefer it for virtualization and file sharing. However, SMB is not a great fit for large enterprises and high-performance environments due to bandwidth limitations, and it is not the best option for transferring lots of small, kilobyte-sized files. For such workloads, it is better to use block-level storage such as iSCSI.
Network File System (NFS) is a distributed file system protocol that enables file sharing and remote access to files over a network. It also operates on a client-server model, allowing users to access files stored on remote servers as if they were kept in a local folder.
Similar to SMB, NFS provides users with transparent access and file locking mechanisms, which come in handy when collaboration within a company is needed. Like SMB, NFS is faster and more efficient for large sequential reads/writes and less effective for small-sized I/O. The performance of NFSv4 can be further improved with advanced features such as RDMA and multipathing. However, multipathing in NFS, achieved either with session trunking or via the pNFS extension, is not as reliable or easy to use as in SMB.
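One common way to reason about why file protocols favor large sequential I/O is a simple cost model: every request pays a fixed per-operation overhead on top of the time needed to move the bytes. The Python sketch below is only an illustration of that arithmetic. The function name `transfer_time_s`, the 0.5 ms overhead and the 1 GB/s bandwidth are assumptions, not measurements of SMB or NFS.

```python
def transfer_time_s(
    total_bytes: int,
    io_size_bytes: int,
    per_op_overhead_s: float,
    bandwidth_bytes_per_s: float,
) -> float:
    """Estimate transfer time as (number of requests * overhead) + raw transfer time."""
    ops = -(-total_bytes // io_size_bytes)  # ceiling division
    return ops * per_op_overhead_s + total_bytes / bandwidth_bytes_per_s


GIB = 1024 ** 3
OVERHEAD_S = 0.0005   # hypothetical per-request overhead
BANDWIDTH = 1e9       # hypothetical link speed in bytes per second

# Same 1 GiB of data, moved in 1 MiB requests versus 4 KiB requests.
large_io = transfer_time_s(GIB, 1024 * 1024, OVERHEAD_S, BANDWIDTH)
small_io = transfer_time_s(GIB, 4 * 1024, OVERHEAD_S, BANDWIDTH)

print(f"1 MiB requests: {large_io:.1f} s, 4 KiB requests: {small_io:.1f} s")
```

With these made-up numbers, the small-request case is dominated almost entirely by per-operation overhead rather than by bandwidth, which is the effect described above.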
Internet Small Computer Systems Interface, commonly known as iSCSI, is a block protocol that works over TCP. iSCSI lets you set up a shared storage network, which makes it possible for multiple clients and services to access central storage.
Because iSCSI is a block protocol, the initiator transports block-level data from the server to the target on the storage device. It encapsulates SCSI commands into packets for the TCP/IP layer. Once the packets arrive at their destination, they are unpacked back into SCSI commands, so the OS can read the data as if the physical storage device were locally connected to the computer. The main limitation is that iSCSI on its own does not coordinate simultaneous access by multiple servers to the same volume. However, this can be achieved with file systems that permit multiple simultaneous access, such as clustered file systems (CSV, VMFS, etc.).
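The encapsulation step can be illustrated with a small Python sketch. It builds a standard 10-byte SCSI READ(10) command block and wraps it in a toy envelope before unwrapping it on the "target" side. The envelope (a 4-byte tag plus a length field) is purely hypothetical and is NOT the real iSCSI PDU header; it only shows the general wrap-send-unwrap idea.

```python
import struct

SCSI_READ_10 = 0x28  # opcode of the SCSI READ(10) command

# READ(10) layout: opcode, flags, LBA (4 bytes), group, transfer length (2 bytes), control
READ10_FORMAT = ">BBIBHB"

# Hypothetical envelope: 4-byte tag + 2-byte payload length (not the real iSCSI header)
TOY_HEADER = struct.Struct(">4sH")
TOY_TAG = b"TOY1"


def build_read10_cdb(lba: int, num_blocks: int) -> bytes:
    """Build a READ(10) command block for num_blocks starting at lba."""
    return struct.pack(READ10_FORMAT, SCSI_READ_10, 0, lba, 0, num_blocks, 0)


def encapsulate(cdb: bytes) -> bytes:
    """Initiator side: wrap the SCSI command for transport over TCP/IP."""
    return TOY_HEADER.pack(TOY_TAG, len(cdb)) + cdb


def decapsulate(packet: bytes) -> bytes:
    """Target side: strip the envelope and recover the SCSI command."""
    tag, length = TOY_HEADER.unpack_from(packet)
    if tag != TOY_TAG:
        raise ValueError("unexpected envelope tag")
    return packet[TOY_HEADER.size:TOY_HEADER.size + length]


def parse_read10_cdb(cdb: bytes) -> tuple:
    """Extract (lba, num_blocks) from a READ(10) command block."""
    opcode, _flags, lba, _group, num_blocks, _control = struct.unpack(READ10_FORMAT, cdb)
    if opcode != SCSI_READ_10:
        raise ValueError("not a READ(10) command")
    return lba, num_blocks


packet = encapsulate(build_read10_cdb(lba=2048, num_blocks=8))
lba, num_blocks = parse_read10_cdb(decapsulate(packet))
```

The command that comes out on the target side is byte-for-byte the one the initiator built, which is why the OS can treat the remote volume like a local disk.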
The iSCSI protocol is supported by most hypervisors and operating systems, and you can use your existing Ethernet equipment to deploy the iSCSI SAN infrastructure. There is no need to learn complicated Fibre Channel SAN topology, and neither specialized hardware nor dedicated staff is required to deploy and maintain an iSCSI storage network. Since iSCSI runs over TCP/IP, it technically supports Ethernet speeds of up to 400 Gbps. This makes it a desirable choice over SMB for intensive workloads in enterprise environments.
As for drawbacks, iSCSI by its nature generates a tremendous amount of network traffic. This can be worked around by segregating iSCSI traffic into a separate LAN segment, and speed can be further improved through RDMA and various offloads. However, if the setup isn’t fine-tuned properly, the storage traffic can still cause issues.
Non-Volatile Memory Express over Fabrics, better known as NVMe-oF, is a modern high-speed storage protocol that provides fast and efficient data transfer between initiators and solid-state storage devices over Ethernet, Fibre Channel and InfiniBand. Much like NVMe, NVMe-oF can fully exploit the performance potential of flash storage, which is typically hindered by traditional protocols and interfaces.
Although NVMe-oF is relatively new, it is already widely adopted. It helps enterprises handle a wide variety of workloads that require the lowest network latency and the highest throughput. However, it comes with some downsides, such as increased hardware costs, limited hypervisor support and additional configuration complexity, especially in clustered environments.
Making the Right Choice
In conclusion, the choice between file and block storage, and specifically between their respective protocols, depends on your specific needs and requirements.
- SMB serves as a user-friendly file-level protocol best suited for small to medium businesses running Windows with requirements mostly centred around file sharing and collaboration. Although it is also a viable option for storing virtual machines, block-level protocols are usually a more popular choice in high-intensity clustered environments.
- NFS is widely used for collaborative file sharing and virtualization in Linux environments. Like SMB, it’s mostly tuned for large sequential reads/writes and is less suited for more demanding scenarios such as high-performance VM storage and high availability clustering.
- Conversely, iSCSI offers block-level storage with the flexibility to integrate with existing Ethernet equipment, making it a better choice for data-intensive applications in large virtualized environments. iSCSI is a strong choice for most virtual machine storage use cases, providing a great balance between implementation cost, complexity, and the resulting performance.
- NVMe-oF, the newest of the four, leverages the latest technology for high-speed data transfer in all-flash storage environments, making it ideal for latency-sensitive applications that demand the highest access speeds. High-frequency trading, real-time analytics, and high-performance computing are among the sectors where NVMe-oF can shine.
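The recommendations above can be condensed into a rough rule-of-thumb helper. The Python sketch below is only one possible reading of this list: the `Workload` fields, the function name `suggest_protocol` and the order of the checks are assumptions made for illustration, and a real decision would weigh many more factors.

```python
from dataclasses import dataclass


@dataclass
class Workload:
    """Hypothetical description of an environment, for illustration only."""
    windows_centric: bool       # mostly Windows / Hyper-V hosts
    data_intensive: bool        # demanding VM storage or clustered workloads
    latency_critical: bool      # needs the lowest possible latency
    all_flash: bool             # backed by all-flash storage


def suggest_protocol(workload: Workload) -> str:
    """Return a starting-point protocol suggestion based on the list above."""
    if workload.latency_critical and workload.all_flash:
        return "NVMe-oF"
    if workload.data_intensive:
        return "iSCSI"
    # File sharing and lighter virtualization: pick by platform.
    return "SMB" if workload.windows_centric else "NFS"


office = Workload(windows_centric=True, data_intensive=False,
                  latency_critical=False, all_flash=False)
trading = Workload(windows_centric=False, data_intensive=True,
                   latency_critical=True, all_flash=True)

print(suggest_protocol(office))   # SMB
print(suggest_protocol(trading))  # NVMe-oF
```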
Overall, understanding the “quirks and features” of each protocol is important for choosing the right one for your virtual machine storage. The choice itself, however, depends on your unique requirements, available resources, and the specific workloads your infrastructure needs to support.
In the forthcoming Part 2, we will delve deeper into each storage protocol and discuss its implementation across vSphere, Hyper-V and KVM hypervisors to help you make the best choice for your IT infrastructure.
This material has been prepared in collaboration with Asah Syxtus Mbuo, Technical Writer at StarWind.