Most cloud bills are 35% over-provisioned. The FinOps Foundation's 2026 State of FinOps report puts the median enterprise above 30% in identifiable waste — and that's after most teams have already done one round of Reserved Instance buying and rightsizing.
The first round of cloud cost optimization is usually easy. The second round is where teams get stuck: dashboards full of recommendations, no clear owner, no time to refactor, and no governance to keep the savings from drifting back. Cloud billing isn't a tooling problem. It's an operating-model problem with a tooling layer attached.
Here's the framework we run for every cloud billing engagement — AWS, Azure, GCP, or all three — and what a 90-day program actually delivers.
The five layers of cloud billing optimization
Cloud cost optimization breaks cleanly into five layers. Most teams work the first two and leave compound savings on the table at layers three through five.
Layer 1 — Commitments. Reserved Instances (AWS RDS, ElastiCache, OpenSearch, Redshift, DynamoDB), Compute Savings Plans (AWS EC2/Fargate/Lambda), Azure Reservations and Savings Plans for compute, GCP Committed Use Discounts. Done well: 25–55% off list price for stable workloads. Done badly: stranded commitments and inflexibility. The trick is matching commitment shape to actual workload variability — not just blanket-buying 3-year all-upfront.
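One way to match commitment shape to workload variability is to price several commitment levels against historical hourly usage. The following is a minimal sketch under stated assumptions: the usage series, the flat 30% discount, and the function names `cost_with_commitment` and `best_commitment` are all hypothetical, and real provider pricing has more dimensions (term, payment option, service coverage).

```python
def cost_with_commitment(hourly_usage, commit, discount):
    """Total cost when `commit` $/hour of on-demand-equivalent usage is committed.

    The committed amount is paid every hour at the discounted rate whether or
    not it is used; usage above the commitment is billed at on-demand rates.
    """
    committed_rate = commit * (1 - discount)
    return sum(committed_rate + max(u - commit, 0.0) for u in hourly_usage)


def best_commitment(hourly_usage, discount):
    """Try each observed usage level (plus zero) and keep the cheapest.

    Total cost is piecewise linear in the commitment, so the minimum always
    sits at one of the observed usage levels.
    """
    candidates = sorted(set(hourly_usage) | {0.0})
    return min(candidates, key=lambda c: cost_with_commitment(hourly_usage, c, discount))


# Hypothetical month: a steady $10/hour baseline with a daily 6-hour peak at $25/hour.
usage = ([10.0] * 18 + [25.0] * 6) * 30
commit = best_commitment(usage, discount=0.30)
on_demand = sum(usage)
committed = cost_with_commitment(usage, commit, 0.30)
print(f"commit ${commit:.2f}/h, saving {1 - committed / on_demand:.1%}")
```

In this toy series the cheapest option is to commit only to the $10/hour baseline. Committing to the $25/hour peak would cost more than staying fully on-demand, which is the stranded-commitment failure mode described above.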
Layer 2 — Rightsizing. The boring layer. CPU and memory utilisation rarely above 30%. Storage volumes provisioned for peak from three years ago. Snapshots that nobody owns. Idle Elastic IPs. Dev environments that never sleep. Most teams have a tool that flags these. Few teams have the change-management process to actually act on them at scale. Done well, this is another 10–20% off the bill.
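The flagging step itself is simple, which is why the hard part is the change process around it. As a rough sketch, assuming utilisation samples have already been exported per instance (the `metrics` structure, instance names, and 30% threshold here are illustrative), a filter on 95th-percentile CPU and memory could look like this:

```python
from statistics import quantiles


def p95(samples):
    """95th percentile of a list of utilisation samples (percent)."""
    return quantiles(samples, n=20, method="inclusive")[-1]


def rightsizing_candidates(metrics, threshold=30.0):
    """Return instance names whose p95 CPU and p95 memory both sit below threshold."""
    return sorted(
        name
        for name, m in metrics.items()
        if p95(m["cpu"]) < threshold and p95(m["mem"]) < threshold
    )


# Hypothetical utilisation samples (percent), 20 per instance.
metrics = {
    "api-1": {"cpu": [12, 15, 18, 22, 14] * 4, "mem": [20, 25, 28, 22, 24] * 4},
    "db-1": {"cpu": [40, 55, 70, 65, 80] * 4, "mem": [60, 62, 65, 61, 63] * 4},
    # Mostly idle but with regular spikes: the average looks low, p95 does not.
    "batch-1": {"cpu": [5, 5, 5, 95, 5] * 4, "mem": [10, 10, 10, 40, 10] * 4},
}
print(rightsizing_candidates(metrics))
```

Using a high percentile rather than the average keeps spiky workloads such as `batch-1` off the downsizing list.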
Layer 3 — Workload placement. Spot instances for fault-tolerant workloads (CI/CD, batch ML training, dev/test, video transcoding). Cross-region pricing arbitrage. Tiered storage migration (S3 Intelligent-Tiering, Azure Cool/Archive, GCP Autoclass). On AWS, the EBS GP2-to-GP3 migration alone (comparable baseline performance for most volumes, about 20% cheaper per GB, almost zero risk) is still undone at most teams more than five years after GP3 launched in December 2020. Layer 3 typically adds 8–15% on top of layers 1+2.
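The GP2-to-GP3 arithmetic can be sketched per volume. The prices below are assumptions based on us-east-1 list prices and should be checked against current pricing for your region. The sketch also assumes you provision extra GP3 IOPS to match the GP2 baseline of 3 IOPS per GB, and it ignores throughput charges.

```python
# Assumed list prices (USD); verify against current regional pricing.
GP2_PER_GB = 0.10           # $/GB-month
GP3_PER_GB = 0.08           # $/GB-month
GP3_INCLUDED_IOPS = 3000    # IOPS included with every GP3 volume
GP3_PER_EXTRA_IOPS = 0.005  # $/provisioned IOPS-month above the included amount


def gp2_baseline_iops(size_gb):
    """GP2 baseline: 3 IOPS per GB, floor of 100, ceiling of 16,000."""
    return min(max(3 * size_gb, 100), 16000)


def monthly_saving(size_gb):
    """Monthly saving from moving one volume to GP3 while matching GP2 baseline IOPS."""
    gp2_cost = size_gb * GP2_PER_GB
    extra_iops = max(gp2_baseline_iops(size_gb) - GP3_INCLUDED_IOPS, 0)
    gp3_cost = size_gb * GP3_PER_GB + extra_iops * GP3_PER_EXTRA_IOPS
    return gp2_cost - gp3_cost


for size in (500, 2000):
    saving = monthly_saving(size)
    print(f"{size} GB: ${saving:.2f}/month ({saving / (size * GP2_PER_GB):.1%})")
```

Under these assumptions a 500 GB volume saves the full 20%, while a 2,000 GB volume that needs its IOPS matched saves less, so large volumes are worth checking individually before migrating in bulk.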
Layer 4 — Architecture refactor. Serverless vs containers vs VMs for the same workload. Caching tier sizing. Database engine choice (Aurora vs RDS vs DynamoDB at different scale points). The biggest single line item is usually a database, and the right database choice can move the bill 30–50% by itself for the workloads it suits. This is engineering work, not procurement work, and it's where consulting earns its keep.
Layer 5 — FinOps governance. Anomaly detection on the hourly bill. Showback dashboards by team and product. Chargeback when leadership is ready for it. A weekly bill review that takes 20 minutes, not three hours. Pre-approval guardrails so a junior engineer can't accidentally provision a $40K/month instance. Without governance, layers 1–4 unwind within nine months. Governance is what makes the savings compound.
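Anomaly detection on the hourly bill does not need to start sophisticated. A minimal sketch, assuming hourly cost totals are already available as a list: compare each hour against the mean and standard deviation of a trailing window. The window size, the multiplier `k`, and the `min_delta` floor are illustrative defaults, not tuned values, and managed anomaly-detection services use richer models.

```python
from statistics import mean, stdev


def hourly_anomalies(costs, window=24, k=3.0, min_delta=1.0):
    """Return indices of hours whose cost exceeds the trailing-window mean
    by more than k standard deviations and by at least min_delta dollars."""
    flagged = []
    for i in range(window, len(costs)):
        history = costs[i - window:i]
        mu, sigma = mean(history), stdev(history)
        if costs[i] > mu + k * sigma and costs[i] - mu >= min_delta:
            flagged.append(i)
    return flagged


# Hypothetical two days of hourly spend with one spike at hour 40.
costs = [100.0 + (h % 3) for h in range(48)]
costs[40] = 180.0
print(hourly_anomalies(costs))
```

The `min_delta` floor stops a very flat history from flagging cent-level noise. In practice the flagged indices would feed an alert to a named owner rather than a print statement.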
AWS, Azure, GCP — specific levers, same framework
The framework is provider-agnostic; the levers aren't. A few of the biggest provider-specific moves we see consistently underused:
AWS. Compute Savings Plans (Lambda + Fargate + EC2 in one commitment). EBS GP2→GP3 migration. S3 Intelligent-Tiering on data older than 30 days. CloudFront Price Class match to actual user geography. Aurora I/O-Optimized vs Aurora Standard for high-IOPS workloads. NAT Gateway data processing fees on cross-AZ traffic — one of the biggest "wait, why is that line item so big?" surprises in 2026 audits.
Azure. Reservations + Savings Plans for compute (they combine: reservations are applied first, and the savings plan covers the remaining eligible usage). Azure Hybrid Benefit if you have eligible Windows Server / SQL Server licenses. Azure Cool and Archive storage migration. Spot VMs for AKS node pools. Azure Advisor's right-size recommendations are actually decent in 2026; the gap is acting on them. Reserved Capacity for SQL Database and Cosmos DB if your throughput is steady.
GCP. Committed Use Discounts (compute + memory + license-included). Spot VMs for GKE node pools. Sustained Use Discounts auto-apply (don't double-count them when modelling commitments). Autoclass on Cloud Storage. Network Service Tiers — Premium vs Standard tier savings on egress for traffic that doesn't need premium routing.
If you're multi-cloud, the placement question becomes a real lever: route the same workload to the cheapest provider for the specific resource shape. Most teams don't model this seriously because the operational overhead used to outweigh the savings. In 2026, with mature IaC and Terraform Cloud Adapters, the math has flipped on workloads above ~$50K/month.
The 90-day cloud billing engagement
Here is a concrete week-by-week breakdown of how a typical InfraZen cloud cost optimization engagement runs. The details adapt to the size of your estate, but the shape holds.
Weeks 1–2 — Bill audit + waste discovery.
- Cost & Usage Report (AWS) / Cost Management exports (Azure) / Billing Export (GCP) connected to Athena / Synapse / BigQuery respectively for line-level analysis.
- Identify the top 20 line items by spend, which usually account for about 80% of the bill.
- Tag-coverage audit. If you can't allocate spend to product/team, governance is impossible. Fix this first.
- Deliverable: a written report listing every quantifiable savings opportunity with effort vs payoff scoring.
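The two audit calculations above (spend concentration and tag coverage) can be sketched in a few lines once the billing export is loaded. The row shape here (`line_item`, `cost`, `tags`) is a hypothetical simplification; real Cost & Usage Report, Cost Management, and Billing Export schemas use different column names.

```python
from collections import defaultdict


def top_line_items(rows, n=20):
    """Return the n largest line items by spend and their share of the total bill."""
    totals = defaultdict(float)
    for row in rows:
        totals[row["line_item"]] += row["cost"]
    ranked = sorted(totals.items(), key=lambda kv: kv[1], reverse=True)[:n]
    grand_total = sum(totals.values())
    share = sum(cost for _, cost in ranked) / grand_total if grand_total else 0.0
    return ranked, share


def tag_coverage(rows, required=("team", "product")):
    """Fraction of spend that carries a non-empty value for every required tag."""
    total = sum(r["cost"] for r in rows)
    tagged = sum(
        r["cost"] for r in rows
        if all(r.get("tags", {}).get(tag) for tag in required)
    )
    return tagged / total if total else 0.0


# Hypothetical billing rows.
rows = [
    {"line_item": "RDS", "cost": 400.0, "tags": {"team": "payments", "product": "checkout"}},
    {"line_item": "EC2", "cost": 300.0, "tags": {"team": "platform"}},
    {"line_item": "NAT Gateway", "cost": 200.0, "tags": {}},
    {"line_item": "S3", "cost": 100.0, "tags": {"team": "data", "product": "lake"}},
]
ranked, share = top_line_items(rows, n=2)
print(ranked, f"{share:.0%}", f"{tag_coverage(rows):.0%}")
```

Measuring coverage by spend rather than by resource count keeps the focus on the untagged dollars that block showback.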
Weeks 3–4 — Commitments + rightsizing wave 1.
- Buy the right commitment shape for stable workloads (Layer 1).
- Auto-rightsize the obvious cases — idle resources, way-over-provisioned VMs, oversized RDS/SQL instances (Layer 2).
- Stand up anomaly detection so the next surprise triggers a Slack alert within hours, not at month-end.
- Outcome by week 4: 15–25% bill reduction realised.
Weeks 5–8 — Workload placement + architectural moves.
- Spot adoption for the fault-tolerant tier (CI runners, batch jobs, dev/test).
- Storage tiering migration (S3 Intelligent-Tiering / Azure Cool / GCP Autoclass).
- Top 1–2 architectural refactors with the biggest payoff — often a database engine swap or a cache-tier sizing change (Layers 3–4).
- Outcome by week 8: 25–40% cumulative bill reduction.
Weeks 9–12 — FinOps governance + handover.
- Showback dashboards live for engineering leads.
- Anomaly alert thresholds tuned and on-call rotation set.
- Pre-approval guardrails for high-blast-radius resource provisioning.
- A weekly 20-minute bill-review ritual with a named owner.
- Outcome by week 12: 30–50% sustained bill reduction with the operating model in place to make it stick.
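When modelling cumulative outcomes like these, note that per-layer savings do not simply add up if each percentage is measured against the bill that remains after the previous layer. A small sketch of that arithmetic, under the assumption that savings compound multiplicatively (the per-layer figures are hypothetical, not engagement results):

```python
def cumulative_reduction(layer_savings):
    """Combine per-layer savings, each a fraction of the bill remaining at that point."""
    remaining = 1.0
    for saving in layer_savings:
        remaining *= 1.0 - saving
    return 1.0 - remaining


# Hypothetical results: commitments, rightsizing, then workload placement.
layers = [0.20, 0.12, 0.10]
total = cumulative_reduction(layers)
print(f"cumulative: {total:.1%} (naive sum: {sum(layers):.0%})")
```

With these inputs the cumulative reduction lands a few points below the naive sum, which matters when forecasts are compared against realised savings.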
After day 90, you have a choice: bring the function in-house with the runbooks we've written, or stay on a quarterly retainer where we re-audit, model new commitments at renewal points, and pull the next 5–10% out each year.
Frequently asked questions
How much can cloud billing optimization actually save?
On a workload that has never been seriously optimised, 30–50% reduction in the first 90 days is typical. Past that, expect 5–10% YoY savings if FinOps governance is wired in. Teams that have already done basic RIs and rightsizing can usually still pull another 15–25% out by moving to commitment-based pricing on the right services, killing zombie resources, and re-architecting for spot or serverless where it makes sense.
When do I buy Reserved Instances vs Savings Plans?
On AWS, Compute Savings Plans are the default for most teams in 2026: one commitment covers EC2, Fargate, and Lambda, and it applies automatically to whatever instance family runs. RIs still win for RDS, ElastiCache, OpenSearch, Redshift, and DynamoDB, where Compute Savings Plans don't apply. Start with a 1-year, no-upfront commitment; if you buy EC2 RIs instead of a Savings Plan, choose Convertible. Don't go 3-year until you have a year of stable utilisation data.
What does cloud billing consulting actually cost?
We price most cloud billing engagements as a fixed fee tied to your monthly cloud spend, with a cap on the total. The honest math: if we can't return at least 5× the engagement fee in first-year savings, we won't take the project. That rules out the open-ended timesheet model that most consulting firms run.
Do you have to be single-cloud to benefit?
No. We work across multi-cloud estates. The optimisation framework is the same; only the specific levers differ. Multi-cloud teams often see bigger gains because consolidation and workload-placement arbitrage become available levers that single-cloud teams don't have.
Cloud bills don't have to hurt every month. We run a one-week cloud bill audit that ends in a written savings report — you decide whether to bring us in for the 90-day engagement that ships those savings into production. Book a free 30-minute cloud billing review.