Generative AI and AI Infrastructure from the Cloud Services Providers

Infrastructure-as-a-Service (IaaS) providers are embracing the transformative impact of AI on the cloud computing landscape. The “Big 3” cloud services providers, AWS, Azure, and GCP, all placed a heavy emphasis on their AI services during their latest earnings reports, citing difficulty keeping up with the rapid demand. AWS in particular just had their annual re:Invent conference earlier this month with major keynotes heavily focused on AI announcements.

As many organizations look to decide whether to enter into a new cloud commitment contract or renew an existing agreement with one of the hyperscalers, keeping up to date with the changes the providers are making due to AI is going to be incredibly important. Specifically, the hyperscalers are focusing their attention on two key product areas: Generative AI and AI Infrastructure. Both of these service categories can have an impact on the commercial aspects of your spend commitment and associated incentive packages.

This blog outlines the high-level services that each of the hyperscalers offers in these AI categories, how the solutions are priced, and some important considerations organizations should keep in mind as they invest in the cloud providers.

What Generative AI Services do AWS, GCP, and Azure Offer?

With Microsoft’s total investments in OpenAI amounting to an estimated $13B, and both AWS and Google participating in multi-billion-dollar funding rounds for Anthropic, it’s clear the hyperscalers are making substantial investments in Generative AI. While these offerings can provide enhanced capabilities, it’s important to understand the nuances of how these solutions are being positioned.

Key AI Products

  • Amazon Web Services (AWS): Amazon Bedrock
  • Microsoft: Azure OpenAI
  • Google Cloud Platform (GCP): Vertex AI Platform

Each of these products is a managed service that offers access to the hyperscaler’s own or partner models (Amazon’s Nova, Google’s Gemini models, or OpenAI’s GPT-4 models through Microsoft) as well as to other Foundation Models (FMs). These FMs come from AI companies like Anthropic, Stability AI, and AI21 Labs. Additionally, these products offer the capabilities necessary to customize the FMs to solve a specific business need.

Pricing for these solutions is based on input/output tokens (units of text and characters) for text generation models and on image count and image quality for image-generating models. For organizations looking to customize the FMs for their own specific use case or train them on a set of data, the hyperscalers essentially charge based on the number of tokens processed by the model during training, as well as for any model storage.
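
As a rough illustration of the token-based pricing described above, the sketch below estimates a monthly bill for a text generation model. The function name, the per-1,000-token rates, and the token volumes are hypothetical placeholders, not any provider's published prices; actual rates vary by model, region, and provider.

```python
def text_generation_cost(input_tokens, output_tokens,
                         input_rate_per_1k, output_rate_per_1k):
    """Estimate spend for a token-priced text model.

    Rates are expressed in dollars per 1,000 tokens (an assumption;
    some price lists quote per-million-token or per-character rates).
    """
    input_cost = (input_tokens / 1000) * input_rate_per_1k
    output_cost = (output_tokens / 1000) * output_rate_per_1k
    return input_cost + output_cost


# Hypothetical monthly volumes and rates, for illustration only.
monthly_cost = text_generation_cost(
    input_tokens=50_000_000,
    output_tokens=10_000_000,
    input_rate_per_1k=0.003,
    output_rate_per_1k=0.015,
)
print(f"Estimated monthly cost: ${monthly_cost:,.2f}")
```

Because output tokens are often priced higher than input tokens, a workload that generates long responses can cost noticeably more than one that mostly reads long prompts, even at the same total token count.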

It’s also important to note that these vendors are highly motivated to sell their GenAI assistants, which are priced more on a per user per month metric. However, depending on your relationship with the vendor, this may be a separate negotiation outside of your cloud commitment contract.

Although we always recommend positioning the vendors to view your organization’s spend holistically, it’s key to understand these nuances, especially when negotiating with Microsoft, which is notoriously keen on keeping your product and cloud spend separate.

What AI Infrastructure Services Do AWS, Azure, and GCP Offer?

The rise in demand for AI came with the need for enough compute power to handle compute-intensive AI workloads. Historically, data- or graphics-intensive Virtual Machines (VMs) have been used to support workloads in use cases like weather forecasting, financial modeling, or discovery and research. The hyperscalers have now begun to refocus these high-performance machines on AI because of the high compute intensity that activities like training and deploying Large Language Models (LLMs) can require.

VM Instance Families for AI Workloads

High Performance Compute (HPC):

  • AWS: EC2 Trn1 and EC2 Inf1
  • Azure: H Series
  • GCP: H3 Series

High Performance GPU-Based:

  • AWS: EC2 P5 and EC2 G5
  • Azure: N Series
  • GCP: A3 Series

Pricing for these VMs is going to be structured the same way as your general-purpose compute, but with a higher price tag. Here is a basic example comparing AWS’ High-Performance GPU-Based EC2 instance to a general-purpose instance with a comparable instance makeup:

Instance Name | On-Demand Hourly Rate | vCPU | Memory | Storage             | Network Performance
G5.xlarge     | $1.01                 | 4    | 16 GiB | 1 x 250 GB NVMe SSD | Up to 10 Gigabit
M7gd.xlarge   | $0.21                 | 4    | 16 GiB | 1 x 237 GB NVMe SSD | Up to 12.5 Gigabit

~4.8x the cost for GPU-based vs. general-purpose (based on the rounded hourly rates above)
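
The comparison above can be reproduced with a few lines of arithmetic. This sketch uses the rounded hourly rates from the table, which give a ratio of about 4.8x; unrounded list prices would shift the figure slightly. The 730-hour month is an assumption used only to project the hourly gap to a monthly figure for one instance running continuously.

```python
# Rounded on-demand hourly rates taken from the table above.
gpu_hourly = 1.01       # G5.xlarge
general_hourly = 0.21   # M7gd.xlarge

# Assumption: an average month has about 730 hours (8,760 / 12).
HOURS_PER_MONTH = 730

ratio = gpu_hourly / general_hourly
monthly_delta = (gpu_hourly - general_hourly) * HOURS_PER_MONTH

print(f"GPU-based vs. general-purpose: {ratio:.1f}x")
print(f"Extra monthly cost per always-on instance: ${monthly_delta:.2f}")
```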

These high-performance VMs typically come at the highest cost across the different machine-type families. Instead of just paying on-demand, customers can leverage Reserved Instances and Savings Plans from AWS and Azure or Committed Use Discounts from GCP to help drive savings, especially if your organization expects stable or predictable usage of these VMs.
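
To see why predictable usage matters, here is a minimal sketch comparing on-demand spend against a committed rate. It makes two simplifying assumptions: the commitment is billed for every hour of the month whether or not the VM is running, and the discount is a hypothetical 40%. Actual Reserved Instance, Savings Plan, and Committed Use Discount terms differ by provider, term length, and payment option.

```python
def breakeven_utilization(discount):
    """Fraction of hours a VM must run before a commitment beats on-demand.

    Assumes the committed rate is paid for all hours in the period.
    """
    return 1.0 - discount


def monthly_costs(on_demand_rate, discount, hours_used, hours_in_month=730):
    """Return (on_demand_cost, committed_cost) for one VM in one month."""
    on_demand_cost = on_demand_rate * hours_used
    committed_cost = on_demand_rate * (1.0 - discount) * hours_in_month
    return on_demand_cost, committed_cost


# Hypothetical 40% discount on the $1.01/hour rate from the table above.
for hours in (300, 600):
    on_demand, committed = monthly_costs(1.01, 0.40, hours)
    cheaper = "commitment" if committed < on_demand else "on-demand"
    print(f"{hours} hours: on-demand ${on_demand:.2f}, "
          f"committed ${committed:.2f} -> {cheaper} is cheaper")
```

Under these assumptions the break-even point is 60% utilization: a VM that runs less than that is cheaper on-demand, which is why bursty or experimental AI workloads are riskier candidates for a commitment.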

However, the simple bottom line is that the infrastructure required to support the rising demand for AI is going to push cloud costs higher. As organizations look to adopt these services into their environment and further embed AI into their business processes, they need to be thinking strategically about how these costs will impact their programs.

How Should Organizations Evaluate Generative AI and AI Infrastructure Services?

Organizations evaluating any of the above services should think critically about the following three considerations:

AI Demand and Forecasting

Understanding and forecasting your usage for both GenAI and AI infrastructure solutions from the hyperscalers, as well as the associated costs, is a key action. This is especially true for customers who are approaching a commitment contract negotiation.

The hyperscalers are highly motivated to drive adoption and spend on AI-related offerings, so customers should go through their own due diligence and demand forecasting process to validate what is actually required to drive key business outcomes. Relying too much on the hyperscalers’ projections can result in unnecessary cloud spend or overcommitment.
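
One simple way to pressure-test a proposed commitment is to compare it against several of your own spend forecasts. The sketch below is illustrative only: the scenario names and dollar figures are hypothetical, and it assumes any shortfall against the commitment is still owed, which depends on the terms of your specific agreement.

```python
def commitment_exposure(annual_commitment, forecast_scenarios):
    """Compare forecast annual spend against a commitment level.

    Returns, per scenario, the shortfall (committed but unused spend)
    and the overage (spend above the commitment).
    """
    results = {}
    for name, forecast_spend in forecast_scenarios.items():
        results[name] = {
            "shortfall": max(0.0, annual_commitment - forecast_spend),
            "overage": max(0.0, forecast_spend - annual_commitment),
        }
    return results


# Hypothetical figures for illustration only.
scenarios = {
    "internal_conservative": 4_200_000,
    "internal_expected": 5_100_000,
    "vendor_projection": 6_500_000,
}
for name, exposure in commitment_exposure(5_000_000, scenarios).items():
    print(f"{name}: shortfall ${exposure['shortfall']:,.0f}, "
          f"overage ${exposure['overage']:,.0f}")
```

If the commitment only looks safe under the vendor's projection, that is a signal to revisit the commitment level before signing.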

Cost Management and Discounts

Higher cost AI-specific VMs can drive up cloud spend and potentially warrant a re-evaluation of any existing cloud commitments. The hyperscalers offer a variety of commercial incentives for customers willing to commit to a certain level of spending.

At a high level, these incentives are largely dependent on how much you are willing to commit to, which products or services you’re committing to spend on, and for how long you are willing to make that commitment. With regard to AI, the hyperscalers have begun to include specific incentives tied to the adoption of AI tools.

These incentives are often in the form of credits that cover a significant portion of the cost related to these tools. Finding the appropriate balance of maximizing incentives through both the overall commitment and key product commitments while also mitigating the risks of overspending will result in the best setup for success.

Future-Proofing and Downstream Protection

Before adopting the latest in AI tools, customers should ensure that they have the appropriate level of governance and risk management processes established. Validate that the hyperscalers provide the necessary security provisions, address bias and fairness considerations, and support compliance with regulatory and legal requirements. This will enhance both your data and privacy protection, as well as your overall cybersecurity.

Commercially, there are provisions that can be negotiated into commitment agreements to help mitigate the risks of under- or overconsuming against your commitment and to provide additional flexibility and protection as you look to renew your agreement. These terms can help protect organizations that are increasing their spend because of the higher cost of AI-specific compute or that are considering buying into new AI tools.

The Bottom Line

AI is a quickly evolving landscape, and for customers who are locked into three- or even five-year agreements with the hyperscalers, a lot can change before the contract comes up for renewal. Taking the time to understand the latest AI solutions, how they might support your future initiatives, and their associated costs will help your organization mitigate risk and be better prepared to evaluate your relationship with the cloud providers.

Knowing how to properly invest in the hyperscalers while balancing the demand for AI technology is no easy task. Learn how UpperEdge can help your organization reach its goals by exploring our Cloud Commercial Advisory Services.
