Unpredictability and the Unfulfilled Promise of Cloud Discount Programs 

As companies seek to cut cloud costs, more are looking to take advantage of the savings opportunities provided by AWS discount offerings and avoid the hefty price tag of On-Demand pricing. But what is a realistic level of savings?

For some resources, savings potential goes underused, leaving millions in savings on the table. In the FinOps community, for example, there is an ongoing discussion about the ideal level of coverage accessed through discount offerings such as Reserved Instances (RIs) and Savings Plans. It’s disheartening to read that, for stable and predictable workloads, companies generally consider applying discount programs to just 80% of them a measure of success!

The other side of the coin, however, is that companies are overprovisioning all types of resources, from storage to CPU and more, to ensure consistent application performance and stability. With business and compute needs constantly changing, and with a larger percentage of unpredictable workloads than predictable ones, there often seems to be no good way to balance the two. While application performance and stability are unquestionably critical, the price tag for these excess resources is enough to infuriate any CFO.

But the most maddening part? Solutions to this challenge already exist, and the conversations about ideal RI coverage, among others, should be a thing of the past. While RI coverage is only one element of cloud cost management, FinOps teams can leverage newer tools and applications to not only automate huge amounts of maintenance but also reduce financial waste.

Are Fully Cost-Optimized Workloads Realistic?

Ultimately, most companies arrive at a fork in the road: either take on an aggressive discount program strategy and swallow the costs of over-provisioning, or be risk-averse, accept that an unpredictable chunk of costs will come from running on On-Demand pricing, and pay the premium that entails.

When deciding on the level of discount commitments to purchase, these factors have to be weighed against each other, but only after assessing the workload needs of every instance: its stability, predictability, and consistency. That is easier said than done.

Case in point: forecasting was listed as FinOps’ second-biggest challenge in 2022. And while monthly forecasting practices have improved, doing so accurately is still a struggle. Even as forecasting improves and companies can better utilize discount commitments to avoid costlier On-Demand models, the actual execution and adjustment of resources will still require hours of valuable manpower.

But say you do invest in forecasting and want to really begin maximizing discount programs: current operational practices are still highly conservative.

Common advice for workloads that experience more variability is to aim for lower coverage. In fact, companies just starting FinOps practices generally begin with only 30-50% coverage to retain flexibility, then increase coverage gradually as they attain greater predictability and understanding of their environment.
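The trade-off behind these coverage targets is simple arithmetic. As a rough sketch (the $1.00/hr rate and the 30% commitment discount below are illustrative assumptions, not actual AWS pricing), the blended cost at a given coverage level can be modeled like this:

```python
def effective_hourly_cost(on_demand_rate, coverage, discount):
    """Blended hourly cost when `coverage` (0.0-1.0) of usage runs on a
    committed discount (RI/Savings Plan) and the rest stays On-Demand."""
    committed = on_demand_rate * (1 - discount) * coverage
    uncovered = on_demand_rate * (1 - coverage)
    return committed + uncovered

# Illustrative numbers only: $1.00/hr On-Demand, an assumed 30% discount.
rate, discount = 1.00, 0.30
for coverage in (0.30, 0.50, 0.80, 0.95):
    cost = effective_hourly_cost(rate, coverage, discount)
    print(f"{coverage:.0%} coverage -> ${cost:.2f}/hr "
          f"({1 - cost / rate:.0%} saved vs On-Demand)")
```

Under these assumptions, 30% coverage captures only about 9% in total savings, while 80% coverage captures about 24%: the uncovered remainder dilutes the discount on everything else, which is why conservative coverage targets leave so much on the table.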

Optimizing workloads to save money and resources while maintaining performance and stability is a prize yet to be achieved for far too many organizations.

A Predictable Response: Machine Learning

Machine learning and automation seem to be the answer to most problems these days. And why shouldn’t they be? The tools to automatically prevent over-provisioning as a FinOps practice are readily available.

Spare time is rare among DevOps teams. Even for the most efficient developers on the planet, managing the moving parts of the cloud to perfectly fit shifting company needs at all hours of the day and night is impossible. Internal factors such as launching new products and features, moving from development to production, and migrating from monolith to microservices, as well as factors beyond the organization’s control, such as changes in market demand, make it impossible to manually adjust resources across hundreds to tens of thousands of applications.

It is not just about saving DevOps teams time. Re-allocating discount programs on a moment-to-moment basis, and increasing visibility to help track cost anomalies and stay on top of ongoing trends, will also save entire departments a nice chunk of change.

Employees want tools to better manage their cloud environments. A survey from Microsoft found that 90% of people want simpler automation tools to streamline daily management tasks and free up more time for strategy, and there is no shortage of such tools. So if automation practices can also lead to higher coverage standards, ultimately saving money, reducing waste, and lowering FinOps’ budgetary stress levels, that is simply one more reason to go all in.

FinOps’ Full Potential

If one thing is clear about the future of the cloud, it’s that “cost-by-design” needs to be more widely adopted. Traditional design-stage priorities have been security and performance, and most enterprises don’t fully invest in FinOps capabilities until they’ve reached $100 million a year in cloud spend.

If FinOps is “the key to unlocking the promises of cloud computing,” why does the community still consider it a triumph to get stuck with 20% of their instances running on premium-priced models? This monetary waste can climb into millions of dollars annually. The FinOps community and the enterprises it serves are long overdue to revamp their KPIs to meet new heights that are being unlocked by automated solutions.
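To put that 20% in dollar terms, a back-of-the-envelope estimate makes the point. Every figure below is an assumption for illustration (apart from the $100 million spend level mentioned above): the stable share of spend and the commitment discount will vary by organization.

```python
# Annualized waste when 20% of stable, predictable spend stays On-Demand
# instead of moving onto an assumed 30%-discount commitment.
annual_spend = 100_000_000   # $100M/yr cloud spend (figure cited in the article)
stable_share = 0.60          # assumed fraction of spend that is stable/predictable
uncovered = 0.20             # the 20% left On-Demand at "80% coverage"
discount = 0.30              # assumed commitment discount rate

waste = annual_spend * stable_share * uncovered * discount
print(f"Estimated annual waste: ${waste:,.0f}")  # -> $3,600,000
```

Even under these conservative assumptions, the premium on the uncovered 20% alone runs into the millions each year, which is exactly the waste that automated, continuously re-allocated coverage is meant to eliminate.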

By Maxim Melamedov