Commvault Unveils Clumio Backtrack - Near Instant Dataset Recovery in S3
As Amazon S3 celebrates its 18th birthday, it’s clear that it has surpassed the realm of storage services; it is now the cornerstone of modern data architectures. It has played a crucial role in the transition from monolithic applications to decoupled, economical storage, profoundly impacting the way software is written today. For the past four years of this journey, we at Clumio have had the privilege to partner and innovate with the Amazon S3 team, bringing resilience, compliance, and governance to hundreds of its customers.
Amazon S3 has catalyzed the disaggregation of storage from compute, leading to a colossal shift in application architecture. Customers are now able to seamlessly aggregate petabytes of data from diverse sources without being constrained by attached compute. Massive datasets can now be stored and accessed economically, while compute-intensive tasks like training models and running analytics can be scaled separately to meet the demanding requirements of internet-scale applications, data lakes, and AI. Clumio, itself architected as a stateless data processing pipeline that persists most of its data immutably in Amazon S3, helps customers add resilience and protection to these workloads.
The Pi Day event, hosted by AWS annually on 3/14, this year underscored the acceleration of data lakes and their crucial role in generative AI. With over a million data lakes running on S3, solutions like Iceberg and Delta are becoming household names in the AWS ecosystem. To cater to this rising demand, Amazon S3 has also made significant strides in performance optimization through innovations like S3 Express One Zone, crucial for handling the high throughput of billions of objects that generative AI workloads demand. Furthermore, the integrations with Amazon Bedrock and other AI services such as SageMaker and Vector Store exemplify how Amazon S3 is facilitating seamless AI operations.
The GenAI workflow has distinct storage stages, and Clumio provides resilience for AWS customers across all these stages.
What makes Clumio different is that it’s not built on traditional filesystem-based architectures from the VM-based world. Traditional backup solutions start to exhibit scalability issues at any meaningful size, say, a few petabytes of LLM training data. Clumio helps consolidate, optimize, and streamline disparate data copies and resilience mechanisms into one serverless, scalable platform. For example, before Clumio, LexisNexis had no solution that could protect and restore its billion-object data lake within its target SLAs.
In addition, with the unstoppable force of generative AI meeting the immovable focus on regulation, there is enormous flux in the domain of AI compliance. This is playing out in front of our eyes: Clumio has some famous genomics customers that are using AI for drug discovery, and they want to do it responsibly because new regulations will require backward traceability for years, perhaps decades. They use Clumio to identify, classify, intelligently back up, and immutably encrypt those datasets that they foresee being subject to regulation.
We’re beginning to see evidence that regulation for AI will emanate from regulation of the source datasets. When it comes to verticalized AI solutions, it will still be SOC 2 and ISO 27001 compliance for most enterprises, HIPAA for life sciences, medical, and healthcare companies, COPPA for edtech and online gaming outfits, various FINRA and SEC regulations for financial services providers, and some very specific regulations created by the regulatory bodies for manufacturing, automotive, and heavy industries.
At Clumio, we recognize that Amazon S3 is among the defining technological innovations of the 21st century. And its influence will only accelerate in the age of AI, as it becomes the de facto storage substrate for AI development in the cloud. And for all customers of Amazon S3, Clumio is committed to delivering unwavering resilience, automated and at scale, for AI development now and in the future. Not just as a backup provider, but as a partner in innovation.
Join Clumio in a chat with AI and IA pioneer, Pascal Bornet, as we discuss: