Are Cloud Data Protection Gaps Slowing Your RDS Adoption?
With all the craziness of recent times, I often think back on past vacations and reminisce about how much fun they were: time with family, going on new adventures, and creating new memories. Then I remember the one family vacation when I forgot my passport and it derailed everything. When I realized it, my heart sank, knowing full well that my trip would be canceled or delayed. This is the same feeling many cloud architects have when they realize that the standard data protection methodologies don’t exist in the public cloud, which complicates and adds risk to their cloud journey. While we have seen this occur with IaaS and EBS, PaaS services such as RDS present an even more pronounced “simplicity begets complexity in risk” paradox. RDS provides a ton of benefits by removing the complexity of database infrastructure management and enabling more agility in the cloud, and the last thing you want is to deal with high costs and complexity to protect your databases.
To understand these gaps, let’s first take a look at standard on-premises data protection methodologies, which include three main layers of protection to ensure business requirements are satisfied.
- Snapshots provide faster operational recovery typically on the same storage infrastructure or application stack protecting against accidents, mistakes, deletions, or data issues within a short time duration (typically 1 – 30 days max).
- Primary backups protect data outside the storage or application layer to guard against a complete outage, corruption, or other data loss that takes out the production and snapshot tiers (typically for 1-2 years). This tier is kept separate from the storage infrastructure so that a failure on the primary array doesn’t impact the backup.
- Compliance or Offsite backup provides coverage for legal and compliance requirements typically ranging from 1 – 7 years, but can range all the way out to 21 years for many healthcare organizations. To ensure this backup is always around, many organizations create an air gap backup to make sure that any issues on-premises (accidental or malicious) don’t impact the ability to restore or provide a data copy for compliance.
Each methodology has a product built for a specific purpose and each product has a different cost model depending on the value it brings to the enterprise. There are separate products to ensure there is protection at each level and an air gap between each solution to avoid compromise of one platform. The cost reduction at each tier made long-term retention much more palatable. Can you imagine the costs and risks of keeping 7 years of backups on your primary storage array via snapshots? It would cost you a fortune and not provide the air gap protection.
Now, let’s take a look at what data protection products are available when moving applications to the public cloud.
As you can see from the diagram above, the data protection methodology looks much different in the public cloud. You get…wait for it…yes, snapshots. Snapshots for everything! Now don’t get me wrong, snapshots are awesome and a lifesaver for many use cases, but they leave two massive holes in cloud data protection: primary backup and compliance backup. To address these gaps, many enterprises hack together a series of scripts to make snapshots cover all three layers, which results in the following:
Vulnerability to Account Compromise and Ransomware
Snapshots provide quick restores because they are in the same account as the production data or database, which is awesome unless your account gets compromised. When the bad guys (internal or external) get access to the account, they compromise both the data and the snapshots. To get around this vulnerability, snapshots must be replicated to another AWS account to create an air gap, so that access to the production data does not automatically mean access to your backups. Unfortunately, replicating RDS snapshots doubles the storage cost (local and remote snapshots) and incurs replication charges as well. After your initial snapshot, subsequent snapshots are traditionally incremental. However, if you replicate any RDS snapshot, it must be a full copy. This drives up the cost of snapshot management dramatically.
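To get a rough feel for how much the full-copy requirement inflates storage, the sketch below compares an incremental local snapshot chain with the same number of full remote copies. The database size, daily change rate, and snapshot count are hypothetical inputs, and the model simply takes the full-copy behavior described above at face value. It is a back-of-the-envelope illustration, not AWS's billing formula.

```python
def incremental_chain_gb(db_size_gb: float, daily_change_rate: float, snapshots: int) -> float:
    """Storage for one full snapshot plus (snapshots - 1) incrementals.

    Simplifying assumption: every incremental stores
    db_size_gb * daily_change_rate of changed data.
    """
    if snapshots <= 0:
        return 0.0
    return db_size_gb + (snapshots - 1) * db_size_gb * daily_change_rate


def full_copy_gb(db_size_gb: float, snapshots: int) -> float:
    """Storage when every replicated snapshot is a full copy."""
    return db_size_gb * max(snapshots, 0)


if __name__ == "__main__":
    # Hypothetical workload: 500 GB database, 2% daily change, 30 daily snapshots.
    size_gb, change_rate, count = 500.0, 0.02, 30
    local = incremental_chain_gb(size_gb, change_rate, count)
    remote = full_copy_gb(size_gb, count)
    print(f"local incremental chain: {local:,.0f} GB")
    print(f"remote full copies:      {remote:,.0f} GB")
    print(f"combined footprint:      {local + remote:,.0f} GB")
```

Under these assumptions the remote copies, not the local chain, dominate the footprint, which is why the air gap ends up costing far more than "double."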
High Costs for Long-Term Retention
Snapshots for AWS EBS and RDS are expensive because they are built to be restored quickly for point-in-time recovery, but long-term retention doesn’t require that capability. Unfortunately, this results in increased costs for long-term retention, making compliance nearly impossible without breaking the bank. EBS snapshots are pricey, but our customers are seeing a 30% reduction with Clumio. RDS snapshots are a whole new level at almost twice the cost ($0.095 per GB for RDS vs. $0.05 per GB for EBS), making them almost completely unusable for long-term retention.
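The gap between those two rates compounds over a compliance window. The following sketch works through the arithmetic using the per-GB figures quoted above. It assumes the rates are billed per GB per month and that the amount of retained snapshot data stays constant; the 1 TB volume and 7-year window are hypothetical inputs.

```python
RDS_SNAPSHOT_RATE = 0.095  # USD per GB, figure quoted in this post
EBS_SNAPSHOT_RATE = 0.05   # USD per GB, figure quoted in this post


def monthly_snapshot_cost(stored_gb: float, rate_per_gb: float) -> float:
    """Monthly charge for stored_gb of snapshot data (assumes per-GB-month billing)."""
    return stored_gb * rate_per_gb


def retention_cost(stored_gb: float, rate_per_gb: float, months: int) -> float:
    """Total charge for holding a constant stored_gb for the given number of months."""
    return monthly_snapshot_cost(stored_gb, rate_per_gb) * months


if __name__ == "__main__":
    stored_gb = 1000.0   # hypothetical: 1 TB of retained snapshot data
    months = 7 * 12      # hypothetical: 7-year compliance window
    rds_total = retention_cost(stored_gb, RDS_SNAPSHOT_RATE, months)
    ebs_total = retention_cost(stored_gb, EBS_SNAPSHOT_RATE, months)
    print(f"RDS snapshots over {months} months: ${rds_total:,.2f}")
    print(f"EBS snapshots over {months} months: ${ebs_total:,.2f}")
    print(f"RDS / EBS ratio: {rds_total / ebs_total:.2f}x")
```

The model ignores data growth and any cross-account copies, so real bills would likely be higher than what it prints.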
Snapshots have significant overhead for management and restores. Most of the time they require scripting to get around the limitations. RDS specifically limits automated snapshots to 35 days and limits manual snapshots to 100 per account, so you have to take care that your scripts don’t hit the limits. Even restoring a few specific records on RDS is no cakewalk. The process includes replicating the snapshot back to the primary account, creating a new instance, querying the database for the records you would like to restore, then inserting them into the production database. If you forget to tear down the database you restored from, you will face more hidden costs. The soft costs of supporting such a complex, error-prone process are rivaled only by a growing AWS bill.
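As an illustration of the kind of scripting this forces on teams, here is a minimal sketch of the bookkeeping needed to stay under the manual snapshot limit. The `Snapshot` type, the `snapshots_to_prune` helper, and the `headroom` parameter are hypothetical names invented for this example; the limit of 100 is the figure quoted above. The sketch only decides which snapshots to remove and makes no AWS API calls.

```python
from datetime import datetime, timedelta, timezone
from typing import NamedTuple

MANUAL_SNAPSHOT_LIMIT = 100  # manual snapshot limit quoted in this post


class Snapshot(NamedTuple):
    snapshot_id: str      # hypothetical identifier
    created_at: datetime


def snapshots_to_prune(
    snapshots: list[Snapshot],
    limit: int = MANUAL_SNAPSHOT_LIMIT,
    headroom: int = 1,
) -> list[str]:
    """Return IDs of the oldest snapshots to delete so that
    `headroom` new snapshots still fit under `limit`."""
    allowed = max(limit - headroom, 0)
    excess = len(snapshots) - allowed
    if excess <= 0:
        return []
    oldest_first = sorted(snapshots, key=lambda s: s.created_at)
    return [s.snapshot_id for s in oldest_first[:excess]]


if __name__ == "__main__":
    # Hypothetical account that is already sitting at the limit.
    base = datetime(2020, 1, 1, tzinfo=timezone.utc)
    existing = [
        Snapshot(f"manual-{i:03d}", base + timedelta(days=i)) for i in range(100)
    ]
    print(snapshots_to_prune(existing))
```

Note the trade-off baked into even this small helper: making room for new snapshots means deleting the oldest ones, which is exactly what a long-term retention requirement cannot tolerate.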
What options exist outside of AWS?
The competitive landscape is covered with snapshot managers masquerading as backup solutions. In reality, they are just orchestrators creating a veneer atop all of the same pain mentioned above. The journey to the cloud is often not an easy one. It’s even more difficult if you don’t have what you need when you get there. In my next blog post, we will dig into how to accelerate the move to the public cloud for mission-critical databases and how to evolve into PaaS with more than just snapshots in your toolbelt.
Until next time, stay SaaSy my friends…