Bandwidth Throttling for Cloud Backup
Bandwidth is a limited and precious commodity, so the networking team needs to ensure that no single application monopolizes the network. The team manages network capacity by allocating bandwidth per application, and those allocations can vary by time of day depending on how each application is used. For example, during the day the priority may be core applications, so backups should consume only a small portion of the bandwidth. During non-working hours, while core applications are not in use, backups can go all out and consume as much network capacity as they need. Enter bandwidth throttling.
Bandwidth throttling is a way to control how much of the network is used by traffic going from a data center to an external environment. The networking team typically relies on the storage administrators to be good citizens and keep outgoing storage traffic within prescribed limits. Most readers are probably well aware of this, as it is a basic capability provided by almost all storage products. When our customers asked us to implement bandwidth throttling, the request was for capabilities similar to those of standard storage products. However, at Clumio we are not content with just matching what other solutions provide; we want to find out whether we can improve on them.
When we dug deeper, we found that existing storage solutions present a problem when multiple storage appliances or clusters are deployed in one data center and the total replication traffic across those clusters needs to be throttled to a specific value during working hours. This is because existing storage solutions only allow bandwidth to be throttled on a per-cluster basis. To get around this limitation, storage admins split the allocated bandwidth equally between the clusters. The inherent disadvantage is that even if one of the clusters has no data to replicate, the static bandwidth share allocated to that cluster cannot be used by the other clusters. Once we validated with our users that this was indeed an inconvenience, we set out to address it in our solution.
As the Engineering team investigated this problem, they realized that there was an opportunity to build a generic rate limiter: something that could distribute a finite resource, such as network bandwidth, across multiple workers. This is possible because the brains of our operation sit in the cloud and have visibility into the workings of all the workers (Clumio Cloud Connectors). What came out is a beautiful solution that we have applied first to the bandwidth throttling use case.
In keeping with the simplicity goal of Clumio, we present the highly complex backend solution in a very simple way to our users as shown in the figure below.
As can be seen, the user can set two values:
- An overall network bandwidth limit (to ensure that network utilization by Clumio backups never exceeds a specific value)
- A throttle value for any time range on each day of the week (to ensure that network utilization by Clumio backups does not exceed the allocated share during business hours)
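To make the interplay of the two values concrete, here is a minimal sketch of how the limit in force at a given moment could be resolved. It is not Clumio's implementation: the `ThrottleWindow` type, the `effective_limit` function, and the Mbps units are hypothetical names and assumptions chosen for illustration, and the sketch assumes that a scheduled throttle can only lower the overall limit, never raise it.

```python
from dataclasses import dataclass
from datetime import datetime, time


@dataclass(frozen=True)
class ThrottleWindow:
    """Hypothetical schedule entry: a cap that applies on one weekday."""
    weekday: int        # 0 = Monday ... 6 = Sunday (datetime.weekday() convention)
    start: time         # inclusive
    end: time           # exclusive
    limit_mbps: float


def effective_limit(overall_mbps, windows, now):
    """Return the cap in force at `now`.

    The overall limit always applies; any window covering `now`
    can only tighten it (assumption made for this sketch).
    """
    matching = [
        w.limit_mbps
        for w in windows
        if w.weekday == now.weekday() and w.start <= now.time() < w.end
    ]
    return min([overall_mbps, *matching])
```

With an overall limit of 1000 Mbps and a Monday 09:00 to 17:00 window of 100 Mbps, this returns 100 at 10:00 on a Monday and 1000 outside the window.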
Using these parameters, the Clumio service will dynamically allocate bandwidth per Clumio Cloud Connector based on the workload of each Cloud Connector. This ensures that the needs of the user (restrict overall bandwidth utilization by Clumio backups) are met, while also ensuring that there is no static bandwidth allocation to any of the Cloud Connectors, thereby avoiding wastage of the precious network bandwidth.
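The difference from a static split can be illustrated with a small sketch. The post does not describe the actual allocation algorithm, so the code below swaps in a standard technique, max-min fair sharing, purely as an assumption; the `allocate` function, the connector names, and the idea of expressing each connector's workload as a demand in Mbps are all hypothetical.

```python
def allocate(total_mbps, demands):
    """Max-min fair split of `total_mbps` across connectors.

    `demands` maps a connector name to the bandwidth it could use right now.
    Connectors that need less than an equal share get exactly what they need,
    and the leftover is redistributed among the rest.
    """
    allocation = {name: 0.0 for name in demands}
    # Idle connectors (zero demand) receive nothing and free up their share.
    pending = {name: d for name, d in demands.items() if d > 0}
    remaining = float(total_mbps)

    while pending:
        fair_share = remaining / len(pending)
        satisfied = [n for n, d in pending.items() if d <= fair_share]
        if not satisfied:
            # Everyone left wants more than an equal share: split evenly.
            for name in pending:
                allocation[name] = fair_share
            break
        for name in satisfied:
            allocation[name] = pending.pop(name)
            remaining -= allocation[name]

    return allocation
```

Under a 100 Mbps limit with two connectors, a static split would cap each at 50 Mbps. Here, `allocate(100, {"connector-a": 0, "connector-b": 200})` gives the busy connector the full 100 Mbps while the idle one holds none, and the total never exceeds the configured limit.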
To experience this and many other sophisticated capabilities of the Clumio service, reach out to your local sales team at email@example.com and one of our highly trained field teams will happily walk you through the entire set of capabilities of our service.