Case Studies

Outcomes, not PowerPoints.

Anonymized write-ups of real InfraZen engagements — the architecture, the intervention, and the numbers that changed.

Cutting a Middle-East FinTech's AWS bill by 47%.

The situation: A Series B payments FinTech based in Dubai was overspending on AWS through over-provisioned RDS instances, avoidable NAT gateway traffic, and a sprawl of forgotten dev environments. Cloud spend had grown 3x faster than revenue for two consecutive quarters.

What we did: A 6-week FinOps engagement. We ran a full cost and usage analysis, introduced mandatory tagging, moved AWS-service traffic off the NAT gateways and onto VPC endpoints, right-sized RDS onto Graviton instances covered by Reserved Instances, and built an automated teardown pipeline for non-production environments.

The result:

  • 47% monthly AWS bill reduction — sustained for 6+ months
  • $312K annualized savings on compute, storage, and egress
  • Zero production incidents during the right-sizing
  • Full runbooks and dashboards handed off to the internal platform team

Engagement type: Project Delivery · 6 weeks · Fixed-scope SoW
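
To make the teardown step concrete, here is a minimal sketch of the selection logic such a pipeline might use. It is an illustration under stated assumptions, not the pipeline built in this engagement: the `Environment` record, the `env` tag key, and the 7-day idle threshold are all hypothetical, and the sketch only picks candidates rather than calling any cloud API.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass
class Environment:
    name: str
    tags: dict            # hypothetical tag schema; "env" is an assumed key
    last_active: datetime


def teardown_candidates(envs, now, max_idle=timedelta(days=7)):
    """Return names of tagged, non-production environments idle longer than max_idle."""
    stale = []
    for e in envs:
        stage = e.tags.get("env")
        if stage is None:
            # Untagged resources are never deleted blindly; a mandatory
            # tagging policy would report them for follow-up instead.
            continue
        if stage == "prod":
            continue
        if now - e.last_active > max_idle:
            stale.append(e.name)
    return stale


now = datetime(2024, 1, 31, tzinfo=timezone.utc)
envs = [
    Environment("payments-api", {"env": "prod"}, now - timedelta(days=30)),
    Environment("feature-x", {"env": "dev"}, now - timedelta(days=21)),
    Environment("staging", {"env": "staging"}, now - timedelta(days=1)),
    Environment("mystery-box", {}, now - timedelta(days=90)),
]
print(teardown_candidates(envs, now))  # ['feature-x']
```

Skipping untagged resources is a deliberate choice in this sketch: it is why mandatory tagging has to come before automated teardown.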

From 97.1% to 99.98% uptime in one quarter.

The situation: A Series A US-based SaaS company in the HR-tech space was dealing with 2–3 major incidents per month, eroding enterprise renewals. No SLOs, no error budgets, no postmortem discipline. On-call was informal and burning out their lead engineer.

What we did: A 12-week SRE program bootstrap. We defined user-journey-based SLOs with real error budgets, rolled out Grafana + Prometheus observability, wrote runbooks for the top 15 incident classes, instituted blameless postmortems, and paired with their engineers on 4 real incidents to build muscle memory.

The result:

  • 99.98% uptime over the following quarter (from 97.1%)
  • MTTR dropped from 84 minutes to 11 minutes
  • Zero pages outside of business hours for the lead engineer
  • Two enterprise renewals saved, both at risk when the engagement started

Engagement type: Project Delivery · 12 weeks · Converted to Managed SRE retainer
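
For a sense of scale, the uptime figures above can be converted into downtime. The sketch below assumes a 90-day quarter and treats uptime as simple time-based availability; both are assumptions, since user-journey-based SLOs are usually measured per request rather than per minute.

```python
def downtime_minutes(availability_pct, days=90):
    """Minutes of downtime implied by an availability percentage over a period."""
    total_minutes = days * 24 * 60
    return total_minutes * (1 - availability_pct / 100)


before = downtime_minutes(97.1)    # about 3758.4 minutes
after = downtime_minutes(99.98)    # about 25.92 minutes

print(f"97.1%  -> {before / 60:.1f} hours of downtime per quarter")
print(f"99.98% -> {after:.1f} minutes of downtime per quarter")
```

Under those assumptions, the move from 97.1% to 99.98% takes quarterly downtime from roughly 62.6 hours to about 26 minutes.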

Kubernetes migration with zero downtime.

The situation: A large Indian e-commerce marketplace running a monolithic PHP stack on EC2 needed to move to containerized microservices on EKS ahead of their biggest sale of the year. Previous attempts with another vendor had failed twice due to stateful session issues.

What we did: A 16-week Project Delivery engagement. We migrated 47 services in waves, using ArgoCD for GitOps delivery, Istio for traffic shifting, and external Redis for session state. We ran load tests at 3x peak traffic before every wave, and kept the old EC2 fleet warm for instant rollback.

The result:

  • 47 services migrated across 8 waves
  • Zero customer-facing downtime across the entire migration
  • Deploy frequency up 14x, from 2/week to 4/day
  • Handled 5.2x peak traffic on the sale day without intervention

Engagement type: Project Delivery · 16 weeks · Fixed-scope SoW with milestone billing
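
As a rough illustration of wave planning, the sketch below splits 47 services into 8 waves and gates each wave on a load test at 3x peak. The even split, the `svc-NN` names, and the requests-per-second figures are hypothetical; real waves would more likely be grouped by service dependencies.

```python
def plan_waves(services, n_waves):
    """Split services into n_waves contiguous groups of near-equal size."""
    base, extra = divmod(len(services), n_waves)
    waves, start = [], 0
    for i in range(n_waves):
        size = base + (1 if i < extra else 0)  # first `extra` waves take one more
        waves.append(services[start:start + size])
        start += size
    return waves


def load_test_passed(sustained_rps, peak_rps, factor=3):
    """Gate for a wave: the new stack must sustain factor x peak traffic."""
    return sustained_rps >= factor * peak_rps


services = [f"svc-{i:02d}" for i in range(1, 48)]   # 47 placeholder names
waves = plan_waves(services, 8)
print([len(w) for w in waves])                       # [6, 6, 6, 6, 6, 6, 6, 5]
print(load_test_passed(sustained_rps=3200, peak_rps=1000))  # True
```

A wave that fails the gate simply does not ship, which is what keeping the old EC2 fleet warm makes safe.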

Note for the team: The three case studies above are illustrative templates based on typical InfraZen engagement shapes. Replace with real client outcomes (anonymized as needed) and exact numbers. Keep the format: Situation / What we did / Result (bulleted metrics) / Engagement type.

Want this to be your story?

Book a 30-minute architecture review. We'll look at your stack and tell you honestly where the biggest wins are.

Book a Free Review