How to optimize cloud costs with Committed Use Discounts for Compute Engine

jtokimit · 11-15-2023 08:12 AM

Authored by Google Cloud Technical Account Managers, Jun (Tokky) Tokimitsu @jtokimit and Chengying Gu @Chengying_Gu.

Whether you’re an early-stage startup or a large enterprise, everyone wants to be smart with cloud cost management. In our experience working side-by-side with our customers, there are steps that organizations, no matter the size, can follow to make sure they’re getting the most out of the cloud.

One such step that can have a significant impact on your bottom line is taking advantage of cost optimizations for Compute Engine, specifically Committed Use Discounts (CUDs).

If you have any questions, please leave a comment below and someone from the Community or Google Cloud team will be happy to help.

Why should all customers consider Compute Engine cost optimization?

Compute Engine remains at the top of the spending chart for most enterprise customers. The good news is that Compute Engine has many cost optimization techniques and tools for you to choose from, many of which are easy to implement without the need to change code or architecture.

The matrix below outlines different cloud cost optimization techniques for Compute Engine, organized by the amount of savings you can expect and the level of effort required.

[Figure: Cost Optimization Deep Dive for Compute Engine matrix]

We've seen many successful examples of customers optimizing their Total Cost of Ownership (TCO) by leveraging Google managed services and serverless offerings to offload operational overhead, allowing more time to focus on the things that matter to the business. There are other options that can be considered quicker wins. As you can see from the matrix above, Committed Use Discounts are a low-effort option that can achieve high savings.

Let's dive deeper into Committed Use Discounts for Compute Engine cost optimization.

What are Committed Use Discounts (CUDs)?

Committed Use Discounts (CUDs) for Compute Engine offer deep discounts off list pricing or negotiated contractual pricing for your VM instances. In exchange, you commit, for a specified term of one or three years, either to use a minimum level of resources (such as vCPUs, memory, GPUs, Local SSDs, and sole-tenant nodes) in a region or to spend a minimum amount.

Commitments are purchased and billed monthly for the duration of the commitment term, whether the resources are fully used or not. CUDs will stack with enterprise discounts (for eligible SKUs), meaning CUDs are applied first, then the enterprise discount is applied.

Google Cloud offers two types of Committed Use Discounts:

Resource-based CUDs: Ideal for predictable and steady state resource usage. You receive these CUDs when you purchase a resource-based commitment and commit to use a minimum level of Compute Engine resources in a particular region.
Flexible CUDs (spend-based): Ideal for scenarios where you have more predictable Google Cloud spend. You receive flexible CUDs when you purchase a spend-based (or flexible) commitment for Compute Engine and commit to a minimum amount of hourly spend.

You can purchase both resource-based and flexible commitments to cover Compute Engine resources for projects in your Cloud Billing account. You can use your resource-based commitments to cover your predictable, stable, and region-specific resource usage. You can use the flexible commitments to cover any resource usage that isn't specific to any one machine type or region.

Resource-based CUDs vs Flexible CUDs

	Compute Engine Resource-based CUDs	Compute Engine Flexible CUDs
Scope	Purchased in a project by default; billing account level sharing can be enabled	Purchased only at the billing account level
Purchase unit	Resource based (e.g., N2 vCPU, memory, Local SSD, GPU); purchased in terms of the underlying resources. *CUDs for Local SSD and GPU require a reservation first	Spend based (e.g., $100/hour); purchased in terms of $/hour of equivalent on-demand spend. vCPU and memory for N1, N2, N2D, E2, C2, C2D, C3, and C3D (from 2023/10) are supported
Discount off on-demand rate	1-year discount up to 37%; 3-year discount up to 57%	1-year discount up to 28%; 3-year discount up to 46%
Machine family eligibility	Applies to a specific machine family	Applies to most general-purpose and compute-optimized machine families
Regional eligibility	Applies to a specific region	Applies to ALL regions

Resource-based CUDs considerations

Discounts for general-purpose commitments are applied to resources in the following order:

Custom machine types
Sole-tenant nodes
Predefined machine types

When you purchase general-purpose commitments, you pick which machine series the commitment applies to. For example, if you purchase general-purpose E2 commitments, they apply only to E2 machine types. Similarly, general-purpose N2, N2D, C3, C3D, Tau T2D, and N1 commitments each apply only to their own machine series, so commitments never overlap.

For example, assume you have a region with the following mix:

10 N2 custom machine type vCPUs
30 GB of custom machine type memory
2 n2-standard-4 predefined machine types

You purchase N2 commitments for 15 vCPUs and 13.5 GB of memory for committed use. The committed use discounts would be applied first to the N2 custom machine types, and any remaining discounts would be applied to the N2 predefined machine types. In this case, all 10 vCPUs of the N2 custom machine types would be charged at committed use prices, and 13.5 GB of custom machine type memory would be charged at committed use prices.

Finally, the remaining 5 vCPUs of committed use would apply to 5 random vCPUs across the two n2-standard-4 machine types. Any resources that aren't covered by committed use discounts would qualify for sustained use discounts (SUDs).
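The allocation in this example can be reproduced with a few lines of Python. This is only a sketch of the ordering rule described above, not Google Cloud's billing logic: the function name `apply_commitment` and the category labels are made up for illustration, and the predefined usage assumes that each n2-standard-4 has 4 vCPUs and 16 GB of memory.

```python
# Order in which general-purpose commitments are applied, per the list above.
# The category labels are hypothetical names chosen for this sketch.
APPLICATION_ORDER = ["custom", "sole_tenant", "predefined"]


def apply_commitment(committed, usage_by_category):
    """Allocate a committed quantity across usage categories in priority order.

    Returns two dicts keyed by category: the covered and the uncovered amounts.
    """
    remaining = committed
    covered, uncovered = {}, {}
    for category in APPLICATION_ORDER:
        used = usage_by_category.get(category, 0)
        take = min(remaining, used)  # cover as much as the commitment allows
        covered[category] = take
        uncovered[category] = used - take
        remaining -= take
    return covered, uncovered


# Usage mix from the example: 10 custom vCPUs with 30 GB of custom memory,
# plus two n2-standard-4 VMs (assumed 4 vCPUs and 16 GB each).
vcpu_usage = {"custom": 10, "predefined": 2 * 4}
memory_usage_gb = {"custom": 30, "predefined": 2 * 16}

vcpu_covered, vcpu_uncovered = apply_commitment(15, vcpu_usage)
mem_covered, mem_uncovered = apply_commitment(13.5, memory_usage_gb)

print("vCPUs covered:", vcpu_covered)
print("vCPUs uncovered:", vcpu_uncovered)
print("Memory (GB) covered:", mem_covered)
print("Memory (GB) uncovered:", mem_uncovered)
```

Under these assumptions, all 10 custom vCPUs and 5 of the 8 predefined vCPUs are covered, while the 13.5 GB memory commitment is used up entirely by custom machine type memory. The uncovered remainder is what would qualify for SUDs.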

Note: CUDs do not apply to preemptible VM instances or to the f1-micro and g1-small shared-core machine types.

Resource-based CUDs for GPUs and Local SSDs

You can purchase commitments for GPUs or Local SSDs, but you must meet the following requirements:

You must create a reservation that includes the same amount of GPUs or Local SSDs at the time you purchase your commitment.
You must purchase commitments for specific GPU types. For example, you can purchase commitments for either NVIDIA P100s or NVIDIA V100s, but you can't purchase commitments for NVIDIA P100 GPUs and apply them to other GPU types.

There are no additional charges for reserving the resources, and you do not need to commit to vCPUs or memory.

Sharing resource-based CUDs with projects in your Cloud Billing Account

By default, resource-based CUDs are applied to the project where you purchased your resource-based commitments. Discount sharing enables CUDs to be shared across multiple projects linked to a Cloud Billing Account.

The benefit is reduced management overhead: you manage one desired commitment quantity of vCPU and RAM for the entire Billing Account instead of managing commitments individually across all projects. Discount sharing is especially useful for Billing Accounts whose projects have unpredictable usage individually but predictable usage in aggregate.

When discount sharing has been enabled, commitment discounts and charges are shared across projects based on each project’s proportional share of the total eligible usage within the Billing Account on a given day. Current active CUDs and future commitment purchases across all projects in a Billing Account will apply to all usage from all projects.

Below is an example of how discount sharing with proportional attribution works:

	Project 1	Project 2	Project 3	Total
Committed - cores	100 - 1yr CUD	60 - 3yr CUD	0	160
Usage - cores	50	40	110	200
% of billing acct usage	50/200 = 25%	40/200 = 20%	110/200 = 55%
Cores covered by CUD	min(160*25%, 50) = 40	min(160*20%, 40) = 32	min(160*55%, 110) = 88
Attributed CUD charge (cores)	1 yr: 100*25% = 25; 3 yr: 60*25% = 15	1 yr: 100*20% = 20; 3 yr: 60*20% = 12	1 yr: 100*55% = 55; 3 yr: 60*55% = 33
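To make the arithmetic in the table easier to follow, here is a minimal Python sketch of proportional attribution. It only mirrors the formulas shown in the table; the function name `attribute_cud` and the data layout are assumptions made for illustration, not a Cloud Billing API.

```python
def attribute_cud(commitment_pools, usage_by_project):
    """Attribute shared CUD coverage and charges by each project's usage share.

    commitment_pools: committed cores per term, e.g. {"1 yr": 100, "3 yr": 60}
    usage_by_project: eligible cores used per project on a given day
    """
    total_commit = sum(commitment_pools.values())
    total_usage = sum(usage_by_project.values())
    result = {}
    for project, used in usage_by_project.items():
        result[project] = {
            # Project's proportional share of the billing account's usage.
            "share": used / total_usage,
            # Covered cores are capped by what the project actually used.
            "covered": min(total_commit * used / total_usage, used),
            # Each commitment's charge is split by the same proportion.
            "charge": {
                term: cores * used / total_usage
                for term, cores in commitment_pools.items()
            },
        }
    return result


pools = {"1 yr": 100, "3 yr": 60}
usage = {"Project 1": 50, "Project 2": 40, "Project 3": 110}

attribution = attribute_cud(pools, usage)
for project, row in attribution.items():
    print(project, row)
```

Note that Project 3 purchased no commitments but, because it accounts for 55% of the usage, it receives 55% of both the coverage and the commitment charges.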

Order of discounts applied in billing

To optimize the use of your CUDs, Compute Engine first applies all the resource-based CUDs to any eligible hourly usage. Compute Engine then applies the available flexible CUDs to the remaining eligible on-demand usage that was not covered by any resource-based CUDs. Any hourly usage overage or usage that is not covered by your commitments is charged based on the on-demand rates and is eligible for any applicable SUDs. At any given point, a resource is eligible for only one kind of discount.

Apply resource-based CUDs to eligible usage (machine family- and region-specific)
On-demand usage that wasn't discounted by resource-based CUDs can be eligible for flexible CUDs
The remaining usage not covered by any CUDs is eligible for SUDs

You can apply a mix of resource-based and flexible CUDs to maximize savings.
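As a rough illustration of this ordering, the sketch below splits one hour of eligible usage into the three buckets. It is a simplification under stated assumptions: all quantities are expressed in on-demand-equivalent dollars per hour (in practice, resource-based commitments are sized in vCPUs and memory), the numbers are hypothetical, and the function name `apply_discount_waterfall` is invented for this example.

```python
def apply_discount_waterfall(eligible_usage, resource_cud, flexible_cud):
    """Split one hour of eligible usage across discount types, in billing order.

    All arguments are in on-demand-equivalent $/hour (a simplifying assumption).
    """
    # Step 1: resource-based CUDs are applied first.
    by_resource = min(eligible_usage, resource_cud)
    remaining = eligible_usage - by_resource

    # Step 2: flexible CUDs cover what resource-based CUDs did not.
    by_flexible = min(remaining, flexible_cud)

    # Step 3: whatever is left is billed on demand and is eligible for SUDs.
    on_demand = remaining - by_flexible

    return {
        "resource_based_cud": by_resource,
        "flexible_cud": by_flexible,
        "on_demand_sud_eligible": on_demand,
    }


# Hypothetical hour: $100 of eligible usage, $60 of resource-based coverage,
# and a $25/hour flexible commitment.
split = apply_discount_waterfall(100, 60, 25)
print(split)
```

Each dollar of usage lands in exactly one bucket, which reflects the rule that a resource is eligible for only one kind of discount at any given point.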

Understand your bill with CUDs

The CUD billing model is based on debits and credits. Projects are debited for their usage at on-demand prices, as well as for the cost of the purchased CUD commitments. CUD credits then offset the on-demand charges, depending on the utilization percentage.

To understand CUDs usage patterns and the different utilization scenarios, think of CUDs as a special discount for a yearly TV subscription service in exchange for committing to the subscription for a set period of time.

As part of the subscription package, you get 120 hrs each month at a 60% discount off the regular hourly rate of $1.
However, the hours streamed at the discounted rate are capped daily at 120 hrs / 30 days = 4 hrs/day. On any day, every microsecond of streaming counts towards the daily cap. Within the cap, you're billed $0.40/hr.
Once the daily cap is reached, every additional hour is billed at the regular hourly rate of $1. When the daily cap is not fully used on a particular day, the unused hours do not carry over to the next day.
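The streaming analogy can be written out as a small calculation. This sketch models only the rules stated in the bullets above (a daily cap at the discounted rate, the regular rate beyond it, no carry-over), and the function name `daily_streaming_bill` is hypothetical. It charges only for hours actually streamed, whereas a real commitment is billed whether or not it is fully used, as noted earlier.

```python
def daily_streaming_bill(hours_streamed, daily_cap=4.0, regular_rate=1.00, discount=0.60):
    """Bill for one day: discounted rate up to the daily cap, regular rate after.

    Unused capped hours do not carry over, so each day is computed on its own.
    """
    discounted_hours = min(hours_streamed, daily_cap)
    extra_hours = hours_streamed - discounted_hours
    discounted_rate = regular_rate * (1 - discount)  # $0.40/hr in the analogy
    return discounted_hours * discounted_rate + extra_hours * regular_rate


# A light day, a day exactly at the cap, and a heavy day.
for hours in (3, 4, 6):
    print(f"{hours} hrs streamed -> ${daily_streaming_bill(hours):.2f}")
```

On the heavy day, the first 4 hours are billed at the discounted rate and the remaining 2 hours at the regular rate, which is the same shape as usage that exceeds a commitment.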

In CUD terms, you commit to a number of cores each month, and a cap limits how much you can consume at the discounted rate in any 24 hours. When CUD utilization reaches 100%, CUD credits offset 100% of the on-demand charges.

When usage exceeds the commitment (utilization above 100%), the excess usage is charged at on-demand prices. However, if the CUD is underutilized, only partial credits are given.

Below is an example of flexible Compute Engine CUD bills:

	100% CUD utilization	Exceeds 100% CUD utilization	Less than 100% CUD utilization
CUDs purchased for 1 yr	$50/hr	$40/hr	$60/hr
Commitment fee - 28% discount (+)	$36.00	$28.80	$43.20
*On-demand cost charge (+)	$50.00	$50.00	$50.00
CUDs credit (-)	$50.00	$40.00	$50.00
Total	$36.00	$38.80	$43.20

*Assume the current on-demand rate is $50/hr.

The total bill is the commitment fee plus the on-demand charge, minus the CUD credits.
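The three columns of the table can be reproduced with a short calculation. This is a sketch of the debit-and-credit arithmetic shown in the table, not the actual Cloud Billing implementation; the function name `flexible_cud_bill` is invented here, and the rule that the credit equals the smaller of the commitment and the on-demand usage is inferred from the table's numbers.

```python
def flexible_cud_bill(commitment_per_hour, on_demand_per_hour, discount=0.28):
    """Return (commitment fee, CUD credit, total) for one hour, in dollars.

    commitment_per_hour: flexible commitment in on-demand-equivalent $/hour
    on_demand_per_hour: eligible usage at on-demand prices in $/hour
    discount: 1-year flexible CUD discount from the table (28%)
    """
    # Debit: the commitment fee is owed regardless of utilization.
    fee = commitment_per_hour * (1 - discount)
    # Credit: offsets on-demand charges, up to the committed amount
    # (inferred from the table's credit row).
    credit = min(commitment_per_hour, on_demand_per_hour)
    # Total = commitment fee + on-demand charge - CUD credit.
    total = fee + on_demand_per_hour - credit
    return round(fee, 2), credit, round(total, 2)


scenarios = {
    "100% utilization": 50,
    "Exceeds 100% utilization": 40,
    "Less than 100% utilization": 60,
}
for label, commitment in scenarios.items():
    fee, credit, total = flexible_cud_bill(commitment, on_demand_per_hour=50)
    print(f"{label}: fee=${fee:.2f} credit=${credit:.2f} total=${total:.2f}")
```

In this example the lowest total comes from the commitment that exactly matches usage: under-committing leaves $10/hr at on-demand prices, while over-committing pays for $10/hr of commitment that earns no credit.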

Committed Use Discounts tips and best practices

The committed use discount analysis report helps you visualize and understand the effectiveness and financial impact of the CUDs you have purchased. It shows your total CUD savings in the monthly billing cycle, whether your CUDs are being fully utilized, and how much eligible usage could be covered by additional CUDs.

The committed use discount recommender helps you optimize the resource costs of the projects in your Cloud Billing account. CUD recommendations are generated automatically using a formula that analyzes historical and recent usage metrics gathered by Cloud Billing, and includes usage covered by existing commitments. You can apply these recommendations to purchase additional commitments and further optimize your Google Cloud cost.

Once you know how much of your usage is eligible for CUDs, estimate how much of that usage you expect to continue in the future. If you have a predictable workload, you may be able to commit to a higher percentage of your usage. However, if your workload is more variable, you may want to commit to a lower percentage to avoid paying for unused commitments. A conservative commitment provides a starting point for cost savings and guards against over-purchasing. Review your purchased commitments, eligible usage coverage, and commitment utilization regularly to identify eligible usage for additional CUD coverage.

Reference links and additional resources

This article is based on recent sessions from the Technical Account Management webinar series. You can see the recordings in the following links: