INTRODUCTION

Backup is often considered a necessary overhead for any organization. Because it is critical, however, it should be simple, robust, easy to integrate, and, above all, inexpensive. To help customers balance cost, risk, and simplicity, StarWind works seamlessly with different types of backup providers.

It’s well known that the basic principle of building any highly available environment is eliminating single points of failure in the hardware and software configurations. Since a single hardware failure can lead to downtime of the entire system, it is vital to make every element of the system redundant in order to minimize downtime caused by malfunctions. It is important to note, however, that high availability (HA) can itself fail on the hosts, leaving the infrastructure without any way to bring data back, so it cannot be considered a backup.

This is where backups come in handy. With backups off-site or in the cloud, production virtual machines can be brought back online in case of a catastrophic event. This guide describes best practices for backing up the StarWind Virtual SAN environment. Admins use many different backup solutions, and each of them comes with backup best practices provided by its vendor.

A well-known backup strategy is the 3-2-1 backup rule, which is based on the following principles:

a) Have at least three copies of data.

Set up several backup jobs for each VMware or Hyper-V VM. This means keeping at least two backups in addition to the primary data.

b) Store the copies on two different media.

To store data copies, admins can use tapes, disks, and more.

c) Keep one backup copy off-site.

Set up Backup Copy Jobs to transfer backups to an off-site location (e.g., another datacenter or the cloud). Physical separation of copies is important: it is not recommended to keep external backup storage in the same room as the production storage. By following this simple rule, admins are far better prepared for a disaster or catastrophic event.
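
The three principles above can be expressed as a simple check. The following is a minimal Python sketch, not part of any StarWind or backup vendor tooling: the DataCopy type, the check_3_2_1 function, and the sample copies are all hypothetical names and values chosen for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DataCopy:
    """One copy of the data; all field values here are illustrative."""
    name: str
    media: str      # e.g. "disk", "tape", "cloud"
    offsite: bool   # True if stored away from the production site


def check_3_2_1(copies):
    """Return the list of 3-2-1 principles that the given copies violate."""
    problems = []
    if len(copies) < 3:
        problems.append("fewer than three copies of data")
    if len({c.media for c in copies}) < 2:
        problems.append("fewer than two different media")
    if not any(c.offsite for c in copies):
        problems.append("no off-site copy")
    return problems


# Hypothetical layout: primary data, a local backup, and a cloud backup copy.
copies = [
    DataCopy("production", "disk", offsite=False),
    DataCopy("local-backup", "disk", offsite=False),
    DataCopy("backup-copy", "cloud", offsite=True),
]
print(check_3_2_1(copies) or "3-2-1 rule satisfied")
```

An empty result means all three principles hold; removing the cloud copy from the sample list would report all three violations at once.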

Different Backup Methods

For many people, it is quite hard to differentiate backup methods. Most would think that a backup is just an identical copy of all the data on a virtual machine. To better understand these methods, let’s describe them first. The three basic backup types are:

  • Full backups
  • Incremental backups
  • Differential backups

Full Backups

As the name implies, this type contains a full copy of all the data as of a specific point in time. Because a full backup stores all files and folders, frequent full backups are time-consuming and often require a large number of tapes or disks. The advantage of this method is that a restore operation is faster and easier than with the other methods.

Incremental Backups

The incremental backup method was introduced as a way to reduce the time a full backup takes. With this method, only the data that has changed since the previous backup (full or incremental) is backed up. Compared to full backups, this eliminates the need to store multiple copies of unchanged data. The process starts with a full backup of the server, which serves as the reference point for the incremental backup set. After the full backup, incremental backups are made at successive intervals. The downside of this method is that a restore job takes longer, because it retrieves the full backup first and then applies each incremental backup in turn.

Differential Backups

This method backs up all files that have changed since the previous full backup. Many confuse it with the incremental backup, so it is important to understand the difference between the two: an incremental backup includes only the data that has changed since the previous backup of any type, whereas a differential backup contains all the data that has changed since the last full backup. The advantage of the differential backup is that a restore job takes less time than with incremental backups: the restore process retrieves the full backup and then applies only the most recent differential backup made since that full backup.
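
The difference between the three methods is easiest to see in what a restore has to read. The sketch below is an illustration only and assumes a simplified model in which backups are plain labels in chronological order; the restore_chain function and the weekday labels are hypothetical.

```python
def restore_chain(backups, method):
    """Return the backups needed to restore the latest state.

    `backups` is a chronological list of labels. For "incremental" and
    "differential", the first entry is the full backup and the rest are
    incrementals or differentials. For "full", every entry is a full backup.
    """
    if not backups:
        raise ValueError("at least one full backup is required")
    if method == "full":
        # Every backup is complete, so the most recent one is enough.
        return [backups[-1]]
    if method == "incremental":
        # The full backup plus every incremental made after it, in order.
        return list(backups)
    if method == "differential":
        # The full backup plus only the most recent differential, if any.
        if len(backups) == 1:
            return [backups[0]]
        return [backups[0], backups[-1]]
    raise ValueError(f"unknown method: {method}")


week = ["Sun-full", "Mon", "Tue", "Wed"]
for method in ("full", "incremental", "differential"):
    print(method, restore_chain(week, method))
```

With four backups in the set, the incremental restore reads all four, the differential restore reads two, and the full restore reads one, which mirrors the restore-time trade-off described above.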

Backing up Virtual Machines in the StarWind Environment

With a variety of vendors on the market, it is always important to understand how to do proper backups without causing any issues. In the vSphere environment, it is no secret that almost all software-defined solutions need to be installed inside virtual machines, and StarWind Virtual SAN is no exception. When StarWind runs inside a virtual machine with the synchronous replication feature, it is not recommended to make backups or snapshots of the virtual machine that hosts the StarWind service. Doing so could pause the StarWind virtual machine while the StarWind service is under load, leading to split-brain issues in devices with synchronous replication and, consequently, data corruption. Instead of backing up the StarWind virtual machine itself, it is recommended to back up the data and virtual machines located on the StarWind HA storage. In addition, StarWind devices (.img) should not be backed up either. The only StarWind component that can be backed up is the StarWind configuration file (StarWind.cfg).
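
As an illustration of this rule, a backup script that walks files on a StarWind host could apply a filter like the one sketched below. This is a hedged example, not a StarWind utility: the should_back_up function is hypothetical, and the directory paths are invented purely to exercise it (only the StarWind.cfg file name and the .img extension come from the text above).

```python
from pathlib import PureWindowsPath


def should_back_up(path):
    """Return True only for the StarWind configuration file.

    Device files (.img) and everything else on the StarWind host are skipped.
    """
    return PureWindowsPath(path).name.lower() == "starwind.cfg"


# Hypothetical paths, used only to exercise the filter.
candidates = [
    r"C:\StarWind\StarWind.cfg",
    r"D:\Devices\volume1.img",
    r"C:\StarWind\service.log",
]
print([p for p in candidates if should_back_up(p)])
```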

In most cases, there is no need to back up the hosts or the OS volume where the StarWind service is running. If there is a requirement to back up the host itself, then, to avoid possible issues with production, all production workload should be moved to another node (or nodes) in the cluster, and the StarWind VSAN service must be stopped on the node where the backup is to be performed.

Design Principles

Once the production storage is set, it is time to configure the backup environment. First of all, it is important to keep in mind the purpose of the backup target and design the storage hardware accordingly. While the production storage needs to satisfy performance requirements, the backup target ultimately needs plenty of disk space at a fair price. The more space is available, the longer the retention time that can be set in the backup solution, and the more restore points are kept.

A common design for this kind of storage consists of an appropriate number of SATA disks, since they deliver large capacity at a very low price. To provide data redundancy, consider using a RAID card that supports RAID 5. This configuration uses an N+1 disk array, which gives one of the best price-per-GB ratios among redundant RAID levels.
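
To make the sizing argument concrete, the following sketch estimates the usable capacity of an N+1 (RAID 5) array and how many restore points fit on it. It is a rough model under stated assumptions: one full backup followed by equally sized incrementals, with no compression, deduplication, or file system overhead. The function names and the sample disk and backup sizes are hypothetical.

```python
def raid5_usable_tb(disk_count, disk_size_tb):
    """Usable capacity of a RAID 5 array: one disk's worth holds parity."""
    if disk_count < 3:
        raise ValueError("RAID 5 needs at least three disks")
    return (disk_count - 1) * disk_size_tb


def restore_points(usable_tb, full_tb, incremental_tb):
    """Restore points that fit: one full backup plus N incrementals."""
    if incremental_tb <= 0:
        raise ValueError("incremental size must be positive")
    if usable_tb < full_tb:
        return 0
    return 1 + (usable_tb - full_tb) // incremental_tb


# Assumed example: eight 4 TB SATA disks, 10 TB full, 1 TB per incremental.
usable = raid5_usable_tb(disk_count=8, disk_size_tb=4)
print(usable, "TB usable")
print(restore_points(usable, full_tb=10, incremental_tb=1), "restore points")
```

In this assumed example, seven of the eight disks hold data, so 28 TB is usable, which leaves room for the full backup and 18 incrementals.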

Another element to consider is network transfer capability. No admin wants the backup target to be the primary bottleneck of backup operations. The network is one of the most important parts of any backup environment. Network bandwidth is just as important as the underlying storage performance: both can make or break application performance. For most backup solutions, it is recommended to have a dedicated network to achieve the best performance.
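
A quick estimate shows why link speed matters for the backup window. The sketch below assumes the network is the only bottleneck and that a fixed fraction of line rate is actually achieved; the transfer_hours function, the 0.8 efficiency factor, and the 2000 GB data size are illustrative assumptions, not measured values.

```python
def transfer_hours(data_gb, link_gbps, efficiency=0.8):
    """Estimate hours needed to move `data_gb` gigabytes over the link.

    `link_gbps` is the nominal link speed in gigabits per second and
    `efficiency` is the assumed fraction of line rate actually achieved.
    """
    if link_gbps <= 0 or not 0 < efficiency <= 1:
        raise ValueError("link speed and efficiency must be positive")
    gigabits = data_gb * 8                      # bytes to bits
    seconds = gigabits / (link_gbps * efficiency)
    return seconds / 3600


# Compare a 1 GbE and a 10 GbE backup network for the same data set.
for link in (1, 10):
    print(f"{link} GbE: {transfer_hours(2000, link):.1f} h")
```

Under these assumptions, the same job that occupies most of a night on a 1 GbE link finishes in well under an hour on 10 GbE, which is one argument for a dedicated, adequately sized backup network.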

The following are typical backup solution schemes for StarWind Virtual SAN. In these schemes, the backup solution is located on a backup server that connects to ESXi or Hyper-V. When planning the network topology, it is good practice to always have redundant connections from the backup server to the switch. In most scenarios, having two switches helps to achieve proper redundancy and eliminate a single point of failure. If extra switches cannot be added to the configuration, using multiple connections will do the job. The following diagrams show a typical two-node setup (ESXi or Hyper-V). For other setups, such as a 3-node setup, the process of adding a backup server is the same. For more information on all supported setups, please refer to StarWind Virtual SAN: Best Practices.

StarWind VTL Appliance

Most admins are familiar with tape backups. With StarWind VTL Appliance, it is possible to create the required tape backups and offload them to the cloud. In addition, VTL Appliance can also be used to store backups (Storage Repository). Note that the 3-2-1 backup rule mentioned above is already in place with StarWind VTL Appliance: cloud replication can be configured on it to make sure that at least one backup copy is located off-site, which helps protect the data against a ransomware attack. To provide ransomware resiliency for the local environment, the Virtual Tape Library should be located on dedicated storage and a dedicated host (which is StarWind VTL Appliance itself) that is isolated from the production environment. The best way to achieve this is to not join VTL Appliance to the domain, to disable file shares on it, and to use a separate network for backup purposes. The main advantage of VTL Appliance is that it comes fully pre-configured and ready to integrate into any existing infrastructure.

Lowering RTO/RPO

Lowering RPO requires more frequent data protection. In other words, backup or replication jobs need to run more often. The more often the data is backed up, the less data needs to be re-keyed in the event of a failure. Lower RTOs can be achieved either by making the second data copy rapidly available, which eliminates the need to copy it across the network, or by upgrading the network between the secondary storage and the primary storage. In general, lowering RPO and RTO makes the data protection process more expensive and complex.
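
The relationship between the objectives and the backup schedule can be sketched numerically. The example below is a simplified model that treats each backup job as instantaneous, so the worst-case data loss equals the gap between two jobs; jobs_per_day and meets_rto are hypothetical helper names, and the sample values are illustrative.

```python
import math


def jobs_per_day(rpo_minutes):
    """Minimum evenly spaced backup jobs per day so that the gap
    between two jobs never exceeds the RPO."""
    if rpo_minutes <= 0:
        raise ValueError("RPO must be positive")
    return math.ceil(24 * 60 / rpo_minutes)


def meets_rto(restore_minutes, rto_minutes):
    """True if the estimated restore time fits within the RTO."""
    return restore_minutes <= rto_minutes


# Tightening the RPO from one day to 15 minutes multiplies the job count.
for rpo in (24 * 60, 60, 15):
    print(f"RPO {rpo} min -> {jobs_per_day(rpo)} jobs/day")

# A restore estimated at 90 minutes does not satisfy a 60-minute RTO.
print(meets_rto(restore_minutes=90, rto_minutes=60))
```

The growth in job count as the RPO shrinks is one way to see why tighter objectives make data protection more expensive.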

CONCLUSION

By following backup best practices, the advantages of an appropriate network topology can be leveraged for both production and backup storage. A combination of StarWind Virtual SAN and a backup solution makes it possible to build a storage infrastructure where users can run their virtual machines and back them up safely.