
// 10 Jun 2022

Automating End-to-End Data Protection with Terraform

Lawrence Chang

Clumio Partners with Terraform

A fundamental principle in recent years for IT and DevOps has been to treat infrastructure as code. Just as application code has a defined syntax and format and builds into a reproducible binary, a company’s infrastructure should be managed and provisioned in an analogous fashion. One of the best-known infrastructure-as-code tools is Terraform, which provides the syntax and structure to help companies minimize environment drift and automate end-to-end reproducibility of cloud-based environments. As such, Clumio has embraced Terraform from the start for both our production and internal cloud SaaS environments. However, given our extensive use of Terraform, we began to ask, “What about backup?” Should it not be as easy for our customers to maintain and reproduce their data protection environment?

Such were the questions that propelled us to the recent development of our own full-fledged Terraform provider and led to Clumio becoming a HashiCorp (the makers of Terraform) Technology Partner. Clumio’s provider exposes a rich set of resources as well as a configurable module that abstracts the use of Clumio backup as a service for AWS. From connecting multiple AWS accounts and regions, to setting up policies and protection rules, to adding users and creating organizational units, the Clumio provider supplies customers with an easy-to-define, reproducible data protection environment.

The following is a quick overview of how to get started with the Clumio provider. As the provider uses APIs to abstract the use of the Clumio cloud, you should create an API key from the Clumio UI or retrieve an existing one. For help with creating an API key, please refer to the Clumio documentation. The subsequent steps assume that such an API key is available to you.

Preparing Your Terraform Automation

Start by setting the CLUMIO_API_TOKEN environment variable to your API key and the CLUMIO_API_BASE_URL environment variable to your API base URL, so that the Clumio provider can interact with the Clumio cloud on your behalf. For allowed API base URLs, please refer to the Clumio provider documentation.

The Clumio module uses the AWS Terraform provider to provision the resources required for data protection in the AWS account being protected. As such, also set the standard AWS credential environment variables: AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY, plus AWS_SESSION_TOKEN if you are using temporary credentials.

A starter Terraform configuration declares the required Clumio and AWS providers. Once it is in place, download the providers with terraform init.
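As a rough sketch, such a starter configuration might look like the block below. The provider source addresses (clumio-code/clumio and hashicorp/aws) and the choice to set the region on the AWS provider are assumptions to verify against the Terraform Registry, and no version constraint is shown because none is stated here.

```hcl
terraform {
  required_providers {
    # Assumed registry address for the Clumio provider.
    clumio = {
      source = "clumio-code/clumio"
    }
    aws = {
      source = "hashicorp/aws"
    }
  }
}

# No arguments: the API token and base URL are read from the environment.
provider "clumio" {}

# AWS credentials are also read from the environment. The region is set to
# us-west-2 to match this walkthrough (an assumption in this sketch).
provider "aws" {
  region = "us-west-2"
}
```

In a real setup it is generally worth pinning a version for each provider so that terraform init resolves the same release on every run.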

Connecting Data Environments

Next, extend the Terraform configuration to instantiate a Clumio connection to the AWS account associated with the AWS environment variables set up during preparation, with us-west-2 specified as the region in which to install the Clumio module.
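A hedged sketch of what this step could look like follows. The resource type, its attributes, the module source, and the module input names are recalled from the provider's public documentation and should be treated as assumptions to check against the version you install. The description text and the role_external_id value are hypothetical placeholders.

```hcl
# Look up the AWS account behind the credentials in the environment.
data "aws_caller_identity" "current" {}

# Register the account and region with Clumio.
resource "clumio_aws_connection" "connection" {
  account_native_id = data.aws_caller_identity.current.account_id
  aws_region        = "us-west-2"
  description       = "Connection managed by Terraform" # hypothetical text
}

# Install the Clumio module, which provisions the AWS-side resources.
module "clumio_protect" {
  source = "clumio-code/aws-template/clumio" # assumed module address

  providers = {
    clumio = clumio
    aws    = aws
  }

  clumio_token          = clumio_aws_connection.connection.token
  role_external_id      = "my_external_id" # hypothetical placeholder
  aws_account_id        = clumio_aws_connection.connection.account_native_id
  aws_region            = clumio_aws_connection.connection.aws_region
  clumio_aws_account_id = clumio_aws_connection.connection.clumio_aws_account_id

  # Enable protection for every data source (assumed input names).
  is_ebs_enabled       = true
  is_rds_enabled       = true
  is_ec2_mssql_enabled = true
  is_dynamodb_enabled  = true
  is_s3_enabled        = true
}
```

Feeding the connection's outputs into the module lets Terraform order the two steps on its own: the connection is created first, and the module is installed with the token it returns.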

Confirm your work thus far with terraform init to download the Clumio module and then terraform plan to inspect which resources will be provisioned. Note that the Terraform configuration for this step enables support for data protection on all AWS data sources. When ready, run terraform apply.

Your AWS account and region are onboarded! You can confirm this from the AWS Environments page in the Clumio UI.

Automating Data Protection

To get started with backup, extend the Terraform configuration to create a Protection Group for S3, define a policy for it, and associate the two. As a result, any S3 bucket tagged with the key-value pair clumio:blog will be protected.
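One way this could be expressed is sketched below. The resource types, attribute names, storage class string, and the JSON shape of the bucket rule are recalled from the provider's public documentation and are assumptions to verify. The resource names and the three-month retention are hypothetical, since only the 7-day backup frequency is given in this post.

```hcl
# Protection Group that matches S3 buckets tagged clumio:blog.
resource "clumio_protection_group" "blog" {
  name = "blog-protection-group" # hypothetical name

  # jsonencode avoids hand-escaping the rule's JSON string.
  bucket_rule = jsonencode({
    aws_tag = {
      "$eq" = { key = "clumio", value = "blog" }
    }
  })

  object_filter {
    storage_classes = ["S3 Standard"] # assumed value; adjust as needed
  }
}

# Policy that backs up the Protection Group every 7 days.
resource "clumio_policy" "blog" {
  name = "blog-policy" # hypothetical name

  operations {
    action_setting = "immediate"
    type           = "protection_group_backup"

    slas {
      retention_duration {
        unit  = "months"
        value = 3 # hypothetical retention; not stated in this post
      }
      rpo_frequency {
        unit  = "days"
        value = 7
      }
    }
  }
}

# Associate the policy with the Protection Group.
resource "clumio_policy_assignment" "blog" {
  entity_id   = clumio_protection_group.blog.id
  entity_type = "protection_group"
  policy_id   = clumio_policy.blog.id
}
```

Because the rule matches on a tag instead of listing bucket names, buckets tagged later are picked up without changing the Terraform configuration.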

Again, confirm your work with terraform plan (terraform init is not required this time) to inspect which resources will be provisioned. When ready, run terraform apply.

… and that’s it! Any S3 bucket tagged with the key-value pair clumio:blog will start to seed and subsequently back up every 7 days.

More Resources and What’s Next

With just the above steps, you can start to manage your data protection infrastructure as code. While the above walks you through a simple data protection setup, you can find more examples in our Clumio provider documentation, including how to connect and protect multiple AWS accounts and regions and how to organize and manage multiple users and organizational units. The same documentation covers each custom resource supplied by the Clumio provider.

Additional improvements and plans for the provider are continuously under discussion, and support for new features and AWS data sources will be added in subsequent releases. If you already use Terraform to manage your infrastructure, the Clumio provider is the perfect complement for your data protection needs (and if you are not using Terraform, this is a great chance to give infrastructure as code a try). We welcome feedback from the community as we look to improve the provider. Better yet, if you want to contribute to our repository, we’re happy to take pull requests. Happy provisioning!

This post was originally published on Clumio Engineering on Medium.