Tuesday, 17 April 2018

Why StarWind Cloud VTL, or getting backup data to cloud object storage in 30 minutes

Many cloud storage providers offer object-based storage nowadays. Unfortunately, backup software vendors are not updating their products fast enough to let companies add these new storage tiers to their existing backup infrastructure.

With a few Whys, I am going to explain how StarWind VTL brings value to companies by giving them access to new, cost-effective cloud storage.

Why backup?

Alright, alright, I am kidding here. No doubt you know the purpose of data backup. I just wanted to remind you that people make mistakes, computer hardware fails, and natural disasters occur. So, it is better to be safe than sorry.

Why tapes?

Historically, tapes have been a very attractive backup medium thanks to their reliability and low cost. 10-20 years ago, the performance of tape libraries and the amount of backup data still made it possible to meet the backup window. Even today, tape backup may still be a viable choice for SMB companies.

Also, tapes are perfect for long-term data archiving. Archiving data on disks, by contrast, is not practical. Who would want to store data on disks for, let's say, 7 years while paying for power, cooling and space?

Scalability is another benefit you get with backup tapes. It is much easier to buy additional tapes to get extra capacity than to expand disk storage, where you would have to buy new disk enclosures and reconfigure storage arrays.

Finally, tapes are mobile. Moving tapes offsite is a common practice that allows data to be restored after a disaster.

It ought to be mentioned that very often tapes and disks do not compete, but rather complement each other. For instance, the Disk-to-Disk-to-Tape approach is still quite a common backup technique.

Why Virtual Tape Library?

According to Wikipedia, 'VTL is a data storage virtualization technology'. In other words, it is an abstraction layer that lets you quickly change the underlying backup media. It still logically presents the familiar tape libraries and tapes, thus minimising the learning curve that usually comes with new technologies. This allows administrators to keep using familiar backup software and policies.

The important improvements VTL brings are performance and mobility. Even with explosive data growth, VTL manages to fit the backup job into a reasonable time frame by accelerating it, so the process does not overlap with the production time window. While it is relatively easy to scale out physical tape drives to improve the backup time, they are not very efficient when you need to recover data very quickly. This is where VTL performance shines the most, as its data access time is very low compared to physical tapes.

VTL brings backup data mobility and security to a new level. Moving data offsite over the network minimizes the risk of sensitive data theft because access to the data can be easily controlled and audited.

The geographical location of the offsite storage becomes less important. Virtual tapes can be copied to an offsite datacenter or even to the cloud as long as there is sufficient bandwidth.
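To get a feel for what "sufficient bandwidth" means, here is a rough back-of-the-envelope sketch. The function name `upload_hours` and the 80% link efficiency are my own assumptions for illustration; real throughput depends on the provider, latency and protocol overhead.

```python
def upload_hours(data_gb: float, link_mbps: float, efficiency: float = 0.8) -> float:
    """Estimate how many hours it takes to copy data_gb gigabytes offsite.

    link_mbps is the nominal uplink speed in megabits per second.
    efficiency is an assumed fraction of the link that is really usable.
    """
    if link_mbps <= 0 or not 0 < efficiency <= 1:
        raise ValueError("link_mbps must be > 0 and efficiency must be in (0, 1]")
    data_megabits = data_gb * 1000 * 8  # decimal gigabytes -> megabits
    seconds = data_megabits / (link_mbps * efficiency)
    return seconds / 3600


if __name__ == "__main__":
    # Hypothetical example: 500 GB of virtual tapes over a 100 Mbit/s uplink.
    print(f"{upload_hours(500, 100):.1f} hours")
```

If the estimate is longer than the gap between two backup jobs, the link rather than the VTL will be the bottleneck.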

Why Object-based storage?

While VTL is a great concept for storing 'warm' backup data there is a fundamental issue with its scalability. This is mostly due to the VTL power and space footprints and high cost.

Object storage, by contrast, allows a higher consolidation ratio, a better deduplication ratio thanks to a single deduplication domain, and very efficient scalability. Also, if you look at object storage specs, you may notice that in a way they resemble physical tapes: no random writes and very large I/Os.

Yes, object storage is not great for latency-sensitive applications, but that is not required for backup data.

All this makes object storage a perfect storage tier for long-retention archival. Even in terms of TCO, object storage is getting very close to physical tapes.

Why StarWind VTL?

The answer is very obvious. StarWind VTL is a universal gateway to cloud and on-prem object-based storage.

The StarWind VTL solution has been able to use AWS S3 storage since 2017. In the latest release, StarWind has added a few other cloud storage providers. So, the full list is as follows:

·      AWS S3 and Glacier
·      Backblaze B2 Cloud Storage
·      Microsoft Azure Cloud Storage

So, essentially, it is software that provides the abstraction layer between your backup product and cloud storage providers, giving you effortless integration with object storage without the need to install several third-party software components.

StarWind VTL improves the classic 3-2-1 approach with a new 4-3-2-1-0 concept

The traditional approach dictates having 3 copies of data on 2 different media, with 1 copy stored off-site.

The screenshot from the StarWind Cloud VTL presentation depicts the new concept. It suggests keeping 4 copies on 3 different media with 2 copies stored offsite, achieved in a 1-click operation and with 0 issues.

The diagram is courtesy of StarWind.
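The counting part of both rules (copies, media types, offsite copies) is easy to check mechanically. Below is a minimal sketch; `BackupCopy` and `satisfies_rule` are hypothetical names of mine, and how you classify media (for example, whether a virtual tape counts as separate from disk) is an assumption you have to settle for your own environment. The "1 click" and "0 issues" parts are not modelled.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BackupCopy:
    media: str     # e.g. "disk", "virtual tape", "cloud object storage"
    offsite: bool  # True if the copy lives outside the primary site


def satisfies_rule(copies, min_copies, min_media, min_offsite):
    """Check the copies / media types / offsite copies part of a backup rule."""
    media_types = {c.media for c in copies}
    offsite_count = sum(1 for c in copies if c.offsite)
    return (
        len(copies) >= min_copies
        and len(media_types) >= min_media
        and offsite_count >= min_offsite
    )


if __name__ == "__main__":
    # Hypothetical layout: production data, a local virtual tape, one cloud copy.
    copies = [
        BackupCopy("disk", offsite=False),
        BackupCopy("virtual tape", offsite=False),
        BackupCopy("cloud object storage", offsite=True),
    ]
    print("3-2-1:", satisfies_rule(copies, 3, 2, 1))
    print("4-3-2:", satisfies_rule(copies, 4, 3, 2))
```

In this layout, adding a second offsite copy (say, in another cloud provider) is what moves the setup from 3-2-1 to 4-3-2.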

Proactive support, introduced in the latest build of StarWind VTL, is the icing on the cake. Here is the high-level Proactive Support workflow:

·      telemetry collected & analysed with AI
·      failure pattern detected and logged
·      support prevents an issue from happening 

According to the StarWind presentation at Storage Field Day 15, "90% of issues are resolved with ProActive support before they actually happen".

Hardware and system requirements for StarWind VTL are pretty low. An Intel Xeon E5620, 4 GB of RAM and a 1 GbE NIC are the minimum that lets you use the product. If you plan to install Veeam B&R on the same server, you will need to beef up the server specifications. The biggest question is the amount of disk space needed to meet the retention policy, that is, how long the virtual tapes will be stored locally before they are offloaded to cloud storage.
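A rough way to size that local disk space is to multiply the daily backup volume by the number of days tapes stay local and add some headroom. This is only a sketch under my own assumptions: the function `local_disk_gb`, the 20% headroom and the example figures are illustrative and do not come from StarWind documentation.

```python
def local_disk_gb(daily_backup_gb: float, local_retention_days: int,
                  headroom: float = 0.2) -> float:
    """Estimate the local disk space needed for virtual tapes.

    daily_backup_gb: average amount written to virtual tapes per day.
    local_retention_days: days a tape stays local before it is offloaded.
    headroom: assumed safety margin for growth and failed offloads.
    """
    if daily_backup_gb < 0 or local_retention_days < 0 or headroom < 0:
        raise ValueError("all inputs must be non-negative")
    return daily_backup_gb * local_retention_days * (1 + headroom)


if __name__ == "__main__":
    # Hypothetical example: 200 GB per day kept locally for 14 days.
    print(f"{local_disk_gb(200, 14):.0f} GB")
```

Deduplication or compression in the backup software would lower the real number, so treat the result as an upper estimate.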

Let’s have a quick look at the components of the StarWind Cloud VTL solution: 

VTL Server: the software responsible for emulating a physical tape library
Veeam Backup & Replication: one of the best backup products I know
Tape Library drivers: allow communication between the backup server and the VTL
Backblaze storage bucket*: cloud object-based storage
* A bucket is an object storage term; it is used to logically group objects.

Now let's have a look at Cloud VTL topologies.

There are a few ways to deploy this solution. The first diagram depicts the setup you would probably use in a Proof of Concept project. This setup does not consume a lot of resources and at the same time lets you test all the features of the powerful combination of Veeam B&R and StarWind VTL.
It is not recommended to use this setup in a production environment.

Figure 1 - Single Server Topology

The second topology is not a reference architecture, but rather my attempt to show that the components of the solution can be spread across multiple servers. This flexibility enables the administrator to scale the solution out or up to meet the backup performance requirements.

Figure 2 - Distributed Topology

In the diagram above, the Tape Library server is where all the 'magic' happens. The StarWind software emulates an HP MSL tape library and its drives. This virtual tape library is then presented to the Veeam B&R server as an iSCSI target.

The virtual tapes can then be stored on any local or shared storage. Once the backup job is complete, the virtual tapes can be replicated to Backblaze cloud storage (or another cloud storage provider). After successful replication, a tape can either be deleted or kept locally to provide faster recovery if needed.
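The delete-or-keep decision after replication boils down to a simple rule: never remove a local tape that has not been replicated, and keep recent ones for fast restores. The sketch below only illustrates that logic; `VirtualTape`, `tapes_safe_to_delete` and the barcodes are hypothetical and have nothing to do with the actual StarWind interface, where this is configured in the product itself.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass(frozen=True)
class VirtualTape:
    barcode: str
    written_on: date
    replicated: bool  # True once the cloud copy has completed successfully


def tapes_safe_to_delete(tapes, today, keep_local_days):
    """Return barcodes of tapes that are replicated and past local retention."""
    cutoff = today - timedelta(days=keep_local_days)
    return [t.barcode for t in tapes if t.replicated and t.written_on <= cutoff]


if __name__ == "__main__":
    today = date(2018, 4, 17)
    tapes = [
        VirtualTape("TAPE001", date(2018, 4, 1), replicated=True),
        VirtualTape("TAPE002", date(2018, 4, 15), replicated=True),
        VirtualTape("TAPE003", date(2018, 4, 1), replicated=False),
    ]
    # Keep replicated tapes locally for 7 days to allow quick restores.
    print(tapes_safe_to_delete(tapes, today, keep_local_days=7))
```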

The installation document thoroughly covers all the steps, and it took me less than 30 minutes to install all the components and get the first virtual tape replicated to Backblaze.

To summarise, StarWind VTL provides the following benefits:

  • A Disk-to-Disk-to-Cloud backup technique that stays compliant with the 3-2-1 backup rule
  • Access to multiple cloud object-based storage providers
  • A way to get rid of physical tapes

I personally believe StarWind VTL will be in high demand until backup software vendors enhance their applications to integrate with all cloud and on-prem object-based storage. This process could be accelerated by the development of a single unified API standard for object-based storage, but I am not sure that is happening any time soon.