
29 Oct 2020

Improving Amazon EBS backups using EBS direct APIs

Anthony Fiore

This is a guest post from Anthony Fiore and was originally posted on the AWS Storage Blog.


At AWS re:Invent 2019, AWS launched Amazon EBS direct APIs. This feature enables AWS customers to list the blocks in an EBS snapshot, compare the differences between two EBS snapshots, and directly read data from EBS snapshots. Before AWS released the EBS direct APIs, performing these tasks meant launching temporary Amazon Elastic Compute Cloud (EC2) instances and attaching EBS volumes created from the EBS snapshots. Now you can call the APIs from EC2 instances, AWS Lambda functions, or containers. This enables you to read from a snapshot, list the contents of a single snapshot, or list the changes between two EBS snapshots without having to create EBS volumes from snapshots first. In July 2020, AWS launched an additional set of EBS direct APIs that enable customers to create and write directly to EBS snapshots. We have seen a lot of interest and feedback from APN Partners since we launched the direct APIs.
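To make the "list the contents of a single snapshot" operation concrete, here is a minimal Python sketch built around the `ListSnapshotBlocks` call. It assumes `ebs` is a client that behaves like `boto3.client("ebs")`; the helper names `iter_block_pages` and `summarize_snapshot` are hypothetical and only for illustration.

```python
from typing import Any, Dict, Iterator


def iter_block_pages(ebs: Any, snapshot_id: str) -> Iterator[Dict[str, Any]]:
    """Yield each ListSnapshotBlocks response page, following NextToken."""
    kwargs = {"SnapshotId": snapshot_id}
    while True:
        page = ebs.list_snapshot_blocks(**kwargs)
        yield page
        token = page.get("NextToken")
        if not token:
            return
        kwargs["NextToken"] = token


def summarize_snapshot(ebs: Any, snapshot_id: str) -> Dict[str, Any]:
    """Count the allocated blocks in a snapshot without creating a volume."""
    block_count = 0
    block_size = None       # bytes per block, as reported by the API
    volume_size_gib = None  # size of the source volume in GiB
    for page in iter_block_pages(ebs, snapshot_id):
        block_count += len(page.get("Blocks", []))
        block_size = page.get("BlockSize", block_size)
        volume_size_gib = page.get("VolumeSize", volume_size_gib)
    return {
        "block_count": block_count,
        "block_size": block_size,
        "volume_size_gib": volume_size_gib,
        "allocated_bytes": block_count * (block_size or 0),
    }


# Example use (assumes boto3 is installed and credentials are configured;
# the snapshot ID is a placeholder):
#   import boto3
#   print(summarize_snapshot(boto3.client("ebs"), "snap-0123456789abcdef0"))
```

Because the client is passed in as a parameter, the same function can run from an EC2 instance, a Lambda function, or a container, which matches the deployment options described above.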

This blog post focuses specifically on Clumio, an APN Advanced Technology Partner with AWS Storage competency. Clumio provides a secure, multi-platform, backup as a service offering that launched in August 2019, initially protecting on-premises VMware and VMware Cloud on AWS workloads. In December 2019, Clumio’s offerings quickly expanded to include protecting EBS volumes. In this post, I discuss how introducing Amazon EBS direct APIs into their EBS backup workflow helped provide an average of 30% cost savings and an overall 15% reduction in backup times for Clumio customers. As an active participant in the beta program for the direct APIs, Clumio was very interested in implementing them into their EBS backup workflows.

Introduction to Clumio for AWS native services

Clumio provides backup as a service for enterprises with requirements to protect EC2 instances, EBS volumes, and Amazon Relational Database Service (Amazon RDS) instances. Built on AWS, Clumio provides its customers with a simple, agentless method to protect EBS volumes, with none of the typical complexity and costs of managing snapshots or backup infrastructure. With Clumio, you can protect all your accounts, across Regions, in a single interface. This helps ensure consistency of data protection and opens use cases for disaster recovery (DR), testing and development, and migration.

Once subscribed, Clumio customers simply deploy an AWS CloudFormation template in their account and set up their backup policies, and Clumio takes care of the rest. The process is streamlined and can be up and running in 15 minutes or less. Clumio backs up all data securely in an immutable fashion and provides customers with an ‘air gapped’ copy of their data outside of their account. This ensures that if a customer’s account is compromised, or volumes or snapshots are deleted by mistake, they can quickly restore their data to their account. Finally, as a SaaS offering, Clumio transparently provides updates and enhancements for its services on a biweekly basis, such as the integration with Amazon EBS direct APIs. This lets Clumio customers focus on their core competencies rather than spending time maintaining backup infrastructure.

Clumio’s Amazon EBS workflow before EBS direct APIs

When Clumio launched their Clumio for Amazon EBS service in late 2019, EBS direct APIs were not yet available. As a result, whenever Clumio initiated a backup of an Amazon EBS volume, they had to mount the EBS volume on an Amazon EC2 worker instance inside a Clumio EC2 Auto Scaling group. The worker instance then had to read the contents of the entire volume to determine which blocks had changed, and process and send the changed data to the Clumio service. While this workflow was automated and seamless to the customer, it had drawbacks. Chief among them, reading the full contents of a volume just to determine what had changed was expensive in both time and cost for customers.
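To illustrate why a full scan is costly, here is a toy Python sketch of change detection by reading everything. The post does not describe how Clumio's workers actually detected changes, so the approach shown (comparing per-block SHA-256 digests kept from the previous backup) is an assumption, and the function names are hypothetical.

```python
import hashlib
from typing import List


def block_digests(image: bytes, block_size: int) -> List[bytes]:
    """Hash every fixed-size block of a volume image (reads all of it)."""
    return [
        hashlib.sha256(image[offset:offset + block_size]).digest()
        for offset in range(0, len(image), block_size)
    ]


def changed_blocks_by_full_scan(
    previous_digests: List[bytes], current_image: bytes, block_size: int
) -> List[int]:
    """Return indexes of blocks that are new or differ from the last backup.

    The work done is proportional to the size of the volume, not to the
    amount of changed data: every block is read and hashed.
    """
    changed = []
    for index, digest in enumerate(block_digests(current_image, block_size)):
        if index >= len(previous_digests) or previous_digests[index] != digest:
            changed.append(index)
    return changed


# Tiny demonstration with 4-byte blocks instead of real volume data.
old_image = b"aaaabbbbcccc"
new_image = b"aaaaBBBBccccdddd"
print(changed_blocks_by_full_scan(block_digests(old_image, 4), new_image, 4))
# prints [1, 3]
```

Even when only one block has changed, this style of comparison still touches every block, which is the cost the direct APIs remove.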

[Figure: Clumio Amazon EBS workflow before Amazon EBS direct APIs]

Clumio’s Amazon EBS workflow after EBS direct APIs

In February 2020, Clumio rolled out an update to their platform that leveraged the new Amazon EBS direct APIs. As a result, Clumio can now programmatically determine the changed blocks between EBS snapshots. They no longer have to restore EBS volumes from snapshots and compare the differences on an EC2 worker instance, which greatly benefited Clumio customers. Clumio’s customers can now perform EBS backups 15% faster, while spending 30% less on AWS infrastructure costs associated with running Clumio worker instances (EC2 and EBS). Clumio still uses EC2 worker instances today to provide services such as deduplication, compression, and indexing before the data reaches the Clumio service.
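The changed-block approach can be sketched in Python with the `ListChangedBlocks` and `GetSnapshotBlock` calls. This is an illustration under stated assumptions, not Clumio's implementation: `ebs` is assumed to behave like `boto3.client("ebs")`, and the helper names are hypothetical.

```python
from typing import Any, Dict, Iterator, Tuple


def iter_changed_blocks(
    ebs: Any, first_snapshot_id: str, second_snapshot_id: str
) -> Iterator[Dict[str, Any]]:
    """Yield the entries ListChangedBlocks reports between two snapshots."""
    kwargs = {
        "FirstSnapshotId": first_snapshot_id,
        "SecondSnapshotId": second_snapshot_id,
    }
    while True:
        page = ebs.list_changed_blocks(**kwargs)
        yield from page.get("ChangedBlocks", [])
        token = page.get("NextToken")
        if not token:
            return
        kwargs["NextToken"] = token


def read_changed_data(
    ebs: Any, first_snapshot_id: str, second_snapshot_id: str
) -> Iterator[Tuple[int, bytes]]:
    """Yield (block index, data) for changed blocks in the newer snapshot."""
    for block in iter_changed_blocks(ebs, first_snapshot_id, second_snapshot_id):
        token = block.get("SecondBlockToken")
        if token is None:
            # Defensive: skip entries with no token for the second snapshot.
            continue
        response = ebs.get_snapshot_block(
            SnapshotId=second_snapshot_id,
            BlockIndex=block["BlockIndex"],
            BlockToken=token,
        )
        # BlockData is a streaming body; read() returns the block's bytes.
        yield block["BlockIndex"], response["BlockData"].read()
```

Only the blocks that differ are ever read, so the amount of data transferred tracks the change rate of the volume rather than its total size. Block tokens expire, so a real workflow would read blocks soon after listing them.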

[Figure: Clumio Amazon EBS workflow after Amazon EBS direct APIs]

Future plans for Clumio for Amazon EBS with EBS direct snapshot APIs

Clumio, like AWS, is continuously enhancing their services and innovating based on customer feedback. Amazon EBS direct APIs were the result of AWS listening closely to its partners and customers, understanding what features mattered most to them, and innovating with that in mind. Clumio is working closely with AWS service teams on ways to introduce additional Amazon EBS direct APIs into their workflow. The goal is to keep driving down costs for their customers while improving operational efficiency. This includes minimizing spend on AWS services and improving backup and restore times.

One item under consideration is completely removing the need for Amazon EC2 worker instances in a customer AWS account. Clumio could do this by introducing serverless workflows that call the EBS direct APIs for list, read, and write operations. They also have the option of using a serverless architecture to perform the necessary deduplication and compression. These options could further simplify EBS backup and restore operations for Clumio customers.
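As a rough idea of what the write side of such a workflow might look like, here is a Python sketch using the `StartSnapshot`, `PutSnapshotBlock`, and `CompleteSnapshot` calls. It is a sketch under assumptions, not a description of Clumio's plans: `ebs` is assumed to behave like `boto3.client("ebs")`, the function names are hypothetical, and error handling and retries are left out.

```python
import base64
import hashlib
from typing import Any, Dict, Optional

BLOCK_SIZE = 512 * 1024  # EBS direct APIs work with 512 KiB blocks


def block_checksum(data: bytes) -> str:
    """Return the Base64-encoded SHA-256 checksum PutSnapshotBlock expects."""
    return base64.b64encode(hashlib.sha256(data).digest()).decode("ascii")


def write_snapshot(
    ebs: Any,
    volume_size_gib: int,
    blocks: Dict[int, bytes],
    parent_snapshot_id: Optional[str] = None,
) -> str:
    """Create a snapshot from {block index: data} and return its ID."""
    start_kwargs: Dict[str, Any] = {"VolumeSize": volume_size_gib}
    if parent_snapshot_id:
        # With a parent, only the changed blocks need to be written.
        start_kwargs["ParentSnapshotId"] = parent_snapshot_id
    snapshot_id = ebs.start_snapshot(**start_kwargs)["SnapshotId"]

    for index, data in sorted(blocks.items()):
        if len(data) != BLOCK_SIZE:
            raise ValueError("block %d must be exactly %d bytes" % (index, BLOCK_SIZE))
        ebs.put_snapshot_block(
            SnapshotId=snapshot_id,
            BlockIndex=index,
            BlockData=data,
            DataLength=len(data),
            Checksum=block_checksum(data),
            ChecksumAlgorithm="SHA256",
        )

    # Tell the service how many blocks were written so it can seal the snapshot.
    ebs.complete_snapshot(SnapshotId=snapshot_id, ChangedBlocksCount=len(blocks))
    return snapshot_id
```

A function like this has no dependency on an attached volume, so in principle it could run inside an AWS Lambda handler or a container, subject to those services' time and memory limits.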

Summary

In this post, I provided a brief overview of Clumio’s AWS-native service, and covered how their Amazon EBS backup workflows looked before and after the use of EBS direct APIs. The direct APIs are a programmatic way to compare the changes between two EBS snapshots, which helped Clumio customers save an average of 30% in cost and reduce backup times by 15%. I also discussed how they are planning to enhance their service in the future using more of the EBS direct APIs.

If you would also like to consider the Amazon EBS direct APIs for your EBS backup and restore workflow, please review the EBS direct API documentation available on our website.

Thanks for reading this blog post! If you have any comments or questions, please do not hesitate to leave them in the comment section.