StarWind Resource Library

LSFS Container Technical Description

Published: August 19, 2015

Introduction

StarWind Virtual SAN® is a native Windows software-defined VM storage solution. It creates a VM-centric, high-performing storage pool purpose-built for virtualization workloads. StarWind Virtual SAN delivers supreme performance compared to any dedicated SAN solution since it runs locally on the hypervisor, and all I/O is processed by local RAM/SSD caches and disks, never bottlenecked by the storage fabric. StarWind Virtual SAN includes the Log-Structuring File System technology, which coalesces small random writes, typical for virtualized environments, into a stream of large sequential writes. As a result, performance is increased and flash lifespan is extended.

This guide is intended for experienced Windows system administrators and IT professionals who would like to know more about the StarWind Virtual SAN solution and better understand how it works. It describes the LSFS container and explains its features, i.e., deduplication, defragmentation, etc., giving the end user the knowledge necessary to implement it in the system.

A full set of up-to-date technical documentation can always be found here, or by pressing the Help button in the StarWind Management Console.

For any technical inquiries, please visit our online community, the Frequently Asked Questions page, or use the support form to contact our technical support department.

The Features of LSFS and Their Description

1. What is LSFS:

LSFS (Log-Structured File System) is a journaling file system that keeps track of the changes to be made in a journal. This file system keeps no data in place, only the changes: every write is appended to the journal.

The journal is divided into file segments for convenience. The minimum size of a file segment is 128 MB; the maximum is 512 MB. LSFS always keeps one empty file segment.

The initial size of an LSFS device is equal to one empty file segment, irrespective of the declared device size. As data hits the device, it grows automatically with the incoming changes.

Old data is kept on the file system until defragmentation cleans it up.
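
As a minimal illustration of this behavior, here is a Python sketch of an append-only store that grows one file segment at a time; the 128 MB/512 MB limits come from this section, while the class and method names are purely hypothetical:

    # Sketch of LSFS-style segment growth (illustrative names only).
    SEGMENT_MIN = 128 * 2**20   # minimum file-segment size: 128 MB
    SEGMENT_MAX = 512 * 2**20   # maximum file-segment size: 512 MB

    class LogStore:
        def __init__(self):
            # The device starts as a single empty file segment,
            # irrespective of the declared device size.
            self.segments = [bytearray()]

        def append(self, data: bytes):
            # Changes are only appended; old data stays in place
            # until defragmentation reclaims it.
            current = self.segments[-1]
            current.extend(data)
            if len(current) >= SEGMENT_MAX:
                # Always keep one empty file segment ready for writes.
                self.segments.append(bytearray())

    store = LogStore()
    store.append(b"x" * 4096)   # the device grows only as changes arrive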

LSFS supports built-in automatic and manual defragmentation, on-the-fly deduplication, and snapshots.

2. How Snapshots work:

LSFS is a snapshot-based file system. Every snapshot is incremental and occupies only the additional space equal to the changes made since the previous snapshot was created.
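
As a quick worked example under assumed write rates (the delta figures below are hypothetical), the space cost of a series of snapshots is just the sum of the deltas, not repeated full copies of the device:

    # Each snapshot occupies only the changes written since the previous
    # one (sizes in GB; the delta figures are made up for illustration).
    deltas_gb = [1.0, 0.5, 2.0]        # changes per snapshot interval
    snapshot_cost_gb = sum(deltas_gb)  # 3.5 GB total, not 3 full copies
    print(snapshot_cost_gb)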

By default, snapshots are created every 5 minutes and then automatically deleted. A snapshot can also be taken manually.

LSFS also creates restore points during normal operation. These are the latest consistent parts of the journal, which are used in case of failure. Restore points can be viewed via the Snapshot Manager in Device Recovery Mode.

3. How Defragmentation works:

Defragmentation works continuously in the background. Each file segment is defragmented when its junk data exceeds the allowed value. The maximum allowed junk rate before the defragmentation process starts is 60%. This value can be changed via the context menu of the device.

Live data from the old fragmented file segment is moved to an empty file segment, and the old segment file is then deleted.
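
A minimal sketch of this compaction step, assuming the default 60% junk-rate threshold; the segment representation and function names are illustrative, not StarWind's actual implementation:

    # Sketch of per-segment defragmentation (illustrative structures).
    JUNK_RATE_THRESHOLD = 0.60   # default maximum junk rate before defrag

    def needs_defrag(segment):
        junk = sum(size for size, live in segment["blocks"] if not live)
        total = sum(size for size, _ in segment["blocks"])
        return total > 0 and junk / total > JUNK_RATE_THRESHOLD

    def defragment(segment):
        # Live data is copied into a fresh empty segment; the old
        # fragmented segment file would then be deleted.
        return {"blocks": [(size, live)
                           for size, live in segment["blocks"] if live]}

    seg = {"blocks": [(4096, False), (4096, False), (4096, True)]}
    if needs_defrag(seg):   # junk rate here is 2/3, above the 60% threshold
        seg = defragment(seg)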

If available disk space on the physical storage is low, LSFS switches to a more aggressive defragmentation policy, which slows down access speed for the end user. If there is no space left on the physical drive, LSFS becomes read-only.

4. How Deduplication works:

During analysis, unique chunks of data are identified and stored. As the analysis continues, other chunks are compared to the stored copies, and whenever a match occurs, the redundant chunk is replaced with a reference shortcut that points to the stored chunk. Given that the same byte pattern may occur dozens, hundreds, or even thousands of times (the matching frequency depends on the chunk size), the amount of data to be stored or transferred can be greatly reduced.
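
The mechanism can be sketched as a hash-indexed chunk store, where repeated chunks collapse into references; the chunk size and all names below are illustrative, not StarWind's actual implementation:

    import hashlib

    # Sketch of inline deduplication: a duplicate chunk is replaced by a
    # reference to the already-stored copy (illustrative names only).
    CHUNK_SIZE = 4096
    chunk_store = {}   # digest -> chunk bytes

    def write_deduped(data: bytes):
        refs = []
        for i in range(0, len(data), CHUNK_SIZE):
            chunk = data[i:i + CHUNK_SIZE]
            digest = hashlib.sha256(chunk).hexdigest()
            # Store the chunk only the first time it is seen; later
            # occurrences become reference shortcuts to the stored copy.
            chunk_store.setdefault(digest, chunk)
            refs.append(digest)
        return refs

    refs = write_deduped(b"A" * 8192)   # two identical chunks, stored once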

For example, similar VMs placed on top of an LSFS device deduplicate well, but the paging file of each VM does not. According to our observations, deduplication is not applicable to pagefiles, so the size of every pagefile is added as unique data. In short, 10 similar VMs of 12 GB each (2 GB of which is the pagefile) occupy (12 − 2) + 2 × 10 = 30 GB.

5. Performance boost:

The log structuring uses Redirect-on-Write for snapshotting, writes, etc. This means every new data block is written to the next available place on the disk, organizing the data blocks sequentially. No matter which access pattern is used, the underlying storage always receives 4 MB blocks: LSFS coalesces multiple small random writes into a single big sequential write I/O. This makes it possible to achieve up to 90% of raw sequential write performance at the file system level, an order of magnitude better than conventional file systems (e.g., NTFS, ZFS), which reach around 10%. Read patterns may vary, starting from 4 KB blocks, which eliminates the negative performance impact.
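
The coalescing idea can be sketched as a buffer that absorbs small random writes and flushes them as one large sequential write; the 4 MB flush size comes from this section, while the classes and the backend interface are assumptions for illustration:

    # Sketch of coalescing small random writes into 4 MB sequential I/O.
    FLUSH_SIZE = 4 * 2**20   # the underlying storage receives 4 MB blocks

    class Disk:
        def sequential_write(self, block: bytes):
            pass   # stand-in for the underlying spindle or flash device

    class WriteCoalescer:
        def __init__(self, backend: Disk):
            self.backend = backend
            self.buffer = bytearray()

        def write(self, offset: int, data: bytes):
            # Redirect-on-Write: new data goes to the next free place in
            # the log, so the original offset is only block-index
            # bookkeeping (omitted here).
            self.buffer.extend(data)
            while len(self.buffer) >= FLUSH_SIZE:
                self.backend.sequential_write(bytes(self.buffer[:FLUSH_SIZE]))
                del self.buffer[:FLUSH_SIZE]

    coalescer = WriteCoalescer(Disk())
    coalescer.write(0, b"x" * 4096)   # buffered until 4 MB accumulates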

Therefore, the underlying storage may be a cheap spinning drive. The only side effect of LSFS is possible overprovisioning due to fragmentation and the presence of metadata.

6. Overprovisioning:

The junk rate predetermines the maximum allowed LSFS growth (overprovisioning) compared to the declared LSFS size. The default rate is 60%, which means useful data may account for as little as 40% of the occupied space; therefore, LSFS file segments might use 1 / 0.4 = 2.5 times more space than the initial LSFS size. Additionally, metadata occupies up to 20% of the initial LSFS size. Thus, the overprovisioning of LSFS devices is 200%, i.e., the device files can occupy roughly 3 times the declared size.
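
Restated as a quick calculation using the figures above (60% junk rate, 20% metadata overhead):

    # Worked overprovisioning estimate from the section's figures.
    junk_rate = 0.60                      # default maximum junk rate
    useful_fraction = 1 - junk_rate       # useful data: 40% of occupied space
    segment_growth = 1 / useful_fraction  # 1 / 0.4 = 2.5x the declared size
    metadata_overhead = 0.20              # metadata: up to 20% of initial size
    print(round(segment_growth + metadata_overhead, 2))   # 2.7 -> roughly 3x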

It is possible to run full defragmentation via the device context menu. Manually started defragmentation ignores the default junk rate threshold and cleans up all junk blocks inside the file segments.

Metadata occupies additional space as well, but its contribution is usually small. Some patterns cause abnormal metadata growth, and it may occupy as much space as the useful data; the worst such pattern is 4 KB random writes. However, once the whole disk is full, the ratio between metadata and the data itself stabilizes, and 3x growth is the maximum possible. Information about the useful data count, metadata, and fragmentation can be found in the StarWind Management Console when an LSFS device is selected.

7. LSFS based HA synchronization:

HA based on LSFS uses snapshots for synchronization. After any failure, HA syncs only the latest changes: each HA partner holds a healthy snapshot taken before the failure, so there is no need to sync all the data, only the changes made after that snapshot was created. A full sync is performed only after the initial replica creation, and even then only useful data is replicated; junk data is skipped.
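
Conceptually, the fast sync reduces to replaying only the journal entries newer than the last healthy common snapshot; this sketch and its names are an assumption for illustration:

    # Sketch: after a failure, only changes made after the last common
    # healthy snapshot travel between HA partners (illustrative names).
    def fast_sync(source_journal, last_common_snapshot_seq):
        # Each journal entry is a (sequence_number, data_block) pair.
        return [entry for entry in source_journal
                if entry[0] > last_common_snapshot_seq]

    journal = [(1, b"a"), (2, b"b"), (3, b"c")]
    print(fast_sync(journal, last_common_snapshot_seq=2))   # only (3, b"c")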

Current LSFS Limits and Requirements

1. Required RAM (not related to L1 cache; a worked estimate follows the list):

• 4.6 GB of RAM per 1 TB initial LSFS size (deduplication is disabled)
• 7.6 GB of RAM per 1 TB initial LSFS size (deduplication is enabled)
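
As a worked estimate based on the per-terabyte figures above (the helper function itself is hypothetical):

    # RAM estimate from the per-TB figures in this section.
    def required_ram_gb(lsfs_size_tb: float, dedup_enabled: bool) -> float:
        per_tb = 7.6 if dedup_enabled else 4.6
        return per_tb * lsfs_size_tb

    print(required_ram_gb(11, dedup_enabled=True))   # 83.6 GB at the 11 TB max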

2. LSFS maximum size is 11 TB

3. Overprovisioning is 200%

LSFS files can occupy up to 3 times more space than the initial LSFS size. Snapshots require additional space to store them as well.

4. The physical block size of an LSFS device is 4K

This basically means that write speed will be low if write requests are not aligned, and deduplication will not be performed on such writes. However, this should not be a problem, since all brand-new drives use 4K blocks, and starting from Windows Vista it is possible to use such drives. Hyper-V aligns VHDX files, but VHD files might not be aligned. ESX cannot work with 4K drives, but VMFS should be aligned as well, so there is no problem.