Picture this: A surrealist oil painting in which an AI-powered sailboat whose sails are made of money is at the crest of an enormous wave. The boat is about to surf down the front of this wave, and in doing so will cross the starting line of a race that will finish in 2026.
That’s the prompt I typed into DALL·E 3, which generated the image after a few iterations. Why on earth did I ask a generative AI tool to conjure this image? Coming out of re:Invent, you can’t help but think about AI in 2024, and this image sums up my high-level thoughts on the topic. Read on for my thoughts on the intersection of sailboat racing, Gartner’s Hype Cycle for Artificial Intelligence, and Dr. Werner Vogels’ AWS re:Invent keynote focus on cloud cost.
Anyone with an internet connection in 2023 has been hit with an onslaught of chatter about artificial intelligence, especially generative AI. So it’s fitting that Gartner placed genAI on their 2023 Hype Cycle at the very top of the “Peak of Inflated Expectations,” estimating it will reach the “Plateau of Productivity” in 5-10 years. What that means for the next year or two is a precipitous drop into the “Trough of Disillusionment.” (Isn’t that where Atreyu lost Artax in The Neverending Story?)
In plainer language, the market is currently as excited about generative AI as it’s likely to ever be. It’s literally all downhill from here, but that’s not entirely bad news. While companies focused on genAI will start experiencing greater skepticism, on the other end of the cycle will be true market acceptance. The companies that make it through the wild ride will reap the rewards. So how can companies developing generative AI products set themselves up for future success today?
Dr. Werner Vogels’ AWS re:Invent keynote focused on the importance of building cost-conscious, scalable architectures, especially as companies evolve and markets mature. He drew parallels between cost and sustainability; while the environmental half of that comparison was light on evidence, cost optimization is an obvious element of business sustainability.
For companies developing generative AI applications, business sustainability is going to be an important focus on the downward slope into (and eventually back up and out of) disillusionment. The market has seen genAI’s potential, but is also starting to see the limitations of early iterations. The companies that make it through the upcoming era of skepticism will need to show their value in multiple ways.
Customers will need to see products that work as advertised, proving their worth with minimal hallucinations, maximal authenticity, and real utility. Getting there will require better models and a constant stream of high-volume, high-quality data. At the same time, investors will need to see a sustainable business: sound financials, a clear path to profitability, and controlled, predictable costs. That means architecting for scale and cost efficiency should be builders’ top priority.
Experts have suggested that over time, models will become somewhat commoditized, and the greatest competitive advantage will come from datasets. This expectation underscores the criticality of data and, relatedly, of maintaining its availability, security, and resilience. But at the speed and scale required for generative AI, these datasets will be inundated with API calls, making resilience efforts that much more challenging.
The start of a sailing race is chaotic. Unlike other forms of racing in which you start from a standstill, sailboats cross the starting line already moving at full speed. The race really begins before the official start, with boats vying for an optimal starting position. They come within inches of one another, aiming to cross the starting line just as the gun is fired and in position to take maximum advantage of the wind. Cross the line at the right time, in the right position, with the right sails and you’re golden. Too early or late, with the wrong sails or in the wrong spot, and your race is pretty much over just as it’s begun.
Right now, generative AI builders are moving at full speed, about to cross that starting line, and not all of them will make it to the finish, the “Plateau of Productivity.” The race before the race has been chaotic (Bing getting a little too emo) and exciting (ChatGPT’s ubiquity). While some differentiation is already apparent, we’re far from declaring winners and losers. A lot can happen in the years ahead.
Companies that have already built promising products while architecting for cost and scalability will get the best starts, but it’s still possible for others to make strategic moves that will be beneficial in the long run. Those changes would include setting up reliable pipelines of high-quality data, infrastructure that will scale in a cost-effective manner along with that data, and tools for data governance to ensure ongoing efficiency.
As companies position themselves along the starting line, there’s a third aspect of business sustainability that is particularly important to generative AI companies: the protection and preservation of critical data.
AI relies on vast amounts of data for model building, training, and inference, and that data generally lives in a data lake, often stored in Amazon S3. The criticality of that data necessitates backing it up, a task that gets more challenging as data volumes grow and API calls multiply.
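To make that concrete, here’s a minimal sketch of the table stakes for protecting an S3 data lake: turn on versioning so overwrites and deletes remain recoverable, and take an inventory of what you’re protecting. This is an illustrative boto3 example, not Clumio’s method; the bucket name is hypothetical, and listing object-by-object stops being practical long before you reach billions of objects.

```python
import boto3

s3 = boto3.client("s3")
bucket = "my-datalake-bucket"  # hypothetical bucket name

# Baseline protection: versioning keeps a recoverable copy of every
# overwritten or deleted object (old versions still incur storage costs).
s3.put_bucket_versioning(
    Bucket=bucket,
    VersioningConfiguration={"Status": "Enabled"},
)

# Rough inventory of what needs protecting. Fine for small buckets;
# at data-lake scale you'd use S3 Inventory reports instead of listing.
paginator = s3.get_paginator("list_objects_v2")
total_objects, total_bytes = 0, 0
for page in paginator.paginate(Bucket=bucket):
    for obj in page.get("Contents", []):
        total_objects += 1
        total_bytes += obj["Size"]

print(f"{total_objects:,} objects, {total_bytes / 1e12:.2f} TB to protect")
```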
Backup strategies that were workable in the early days can quickly become expensive and failure-prone as data scales beyond the solution’s capabilities. Worse yet, backup solutions that don’t adjust to ease API pressure can bring the whole application down. The right sail can become the wrong sail as conditions change, and in the worst case it becomes an anchor.
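One small example of adjusting to API pressure: a backup client can use client-side adaptive retries, so it backs off when the bucket is busy instead of piling retries on top of production traffic. A hedged sketch using boto3’s built-in retry configuration; the bucket and key names are invented.

```python
import boto3
from botocore.config import Config

# "adaptive" retry mode adds client-side rate limiting on top of the
# standard exponential backoff, so throttling responses (503 SlowDown)
# slow this client down rather than triggering a retry storm.
s3 = boto3.client(
    "s3",
    config=Config(retries={"max_attempts": 10, "mode": "adaptive"}),
)

# Hypothetical backup read of a single data lake object.
resp = s3.get_object(Bucket="my-datalake-bucket", Key="part-00000.parquet")
data = resp["Body"].read()
print(f"Backed up {len(data):,} bytes")
```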
The easier way is to start the race with a sail that automatically adjusts to conditions: in data terms, one with automated, infinite scalability even in high-change environments. Clumio backs up data lakes from a few hundred objects and single-digit terabytes up to 40 billion objects and hundreds of petabytes, maintaining low-RPO point-in-time recovery at up to 20 million changes per hour. It helps customers keep cloud costs under control by finding orphaned snapshots and other unnecessary data copies, and it aligns with cost-conscious architecting through affordable, consumption-based pricing.
The race is on. Generative AI companies need to build better products while focusing on business sustainability through scalability and cost consciousness.
Start your race ahead of the competition:
Learn more, Get a demo, or Start a backup. Happy sailing!