Not a dramatic, one-time invoice — more like a slow leak. Our cloud bill kept rising for no obvious reason. The apps were stable, users were happy… but we knew we were wasting money somewhere. We’re a mid-sized product team running containerized apps, microservices, and multiple clusters (dev/test/prod) — with Kubernetes and Prometheus already in place. The problem? We weren’t using them effectively.
Here’s how we changed that — and cut our monthly cloud bill by 32%.
Step 1: Actually Look at the Metrics
We finally opened Prometheus and Grafana. Not to check pod status, but to understand our resource usage.
The result? A lot of over-provisioned services. We found apps using <15% of their requested CPU while reserving way more than they needed. Multiply that by 40+ services… and there’s your bill.
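Here's a sketch of the kind of check we ran, written as a Prometheus recording rule. It assumes the usual cAdvisor and kube-state-metrics metrics are being scraped; the rule name is our own invention.

```yaml
# recording rule (illustrative): fraction of requested CPU each pod actually uses
groups:
  - name: resource-utilization
    rules:
      - record: pod:cpu_request_utilization:ratio
        expr: |
          # 5m average CPU usage per pod...
          sum by (namespace, pod) (
            rate(container_cpu_usage_seconds_total{container!=""}[5m])
          )
          /
          # ...divided by the CPU the pod requested
          sum by (namespace, pod) (
            kube_pod_container_resource_requests{resource="cpu"}
          )
```

A value of 0.15 means a pod is using 15% of what it reserved. Graph that per service in Grafana and the over-provisioning jumps out.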
Step 2: Set Resource Limits — Like You Mean It
Many deployments had:
- No limits at all.
- Identical values for requests and limits (which removes any headroom to burst).
New rule:
If a service uses less than 30% of its requested CPU/memory for a full week, we cut the request by ~25%.
We made changes based on Prometheus data and always kept rollback plans ready. No performance issues, just tighter provisioning.
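For illustration, this is roughly what a deployment looked like after the pass. The service name and numbers are made up; the point is the shape, with requests set to what the service actually needs and limits left above them for bursts.

```yaml
# deployment.yaml (excerpt), illustrative values
apiVersion: apps/v1
kind: Deployment
metadata:
  name: example-api             # hypothetical service
spec:
  template:
    spec:
      containers:
        - name: api
          image: example/api:1.0
          resources:
            requests:           # what the scheduler reserves on a node
              cpu: 250m
              memory: 256Mi
            limits:             # hard ceiling, kept above requests
              cpu: 500m
              memory: 512Mi
```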
Step 3: Kill Zombie Workloads
Stale test environments. Forgotten staging deployments. “Temporary” pods that became permanent. Sound familiar?
We flagged workloads inactive for 10+ days — mostly test/staging — and removed them after review.
Result: 2 entire nodes were freed up.
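The flagging itself was a query plus human review, nothing fancy. A sketch of the alert, assuming your Prometheus retention covers the full window; the threshold is illustrative.

```yaml
# alert rule (illustrative): pods nearly idle for 10 straight days
groups:
  - name: zombie-workloads
    rules:
      - alert: WorkloadIdleTenDays
        # Average CPU usage under ~5 millicores across the whole window
        expr: |
          sum by (namespace, pod) (
            rate(container_cpu_usage_seconds_total{container!=""}[10d])
          ) < 0.005
        labels:
          severity: info
        annotations:
          summary: "{{ $labels.namespace }}/{{ $labels.pod }} has been nearly idle for 10 days; review before deleting"
```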
Step 4: Rework HPA (Horizontal Pod Autoscaling)
Autoscaling isn’t magic. Some services scaled up aggressively during short bursts… then stayed scaled up for hours.
We:
- Reviewed every HPA policy.
- Lowered max replicas where appropriate.
- Added cooldowns to avoid sticky spikes.
- Replaced autoscaling with fixed replicas for predictable jobs.
This alone reduced unnecessary replica counts and helped rebalance load across the nodes.
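The cooldowns came from the `behavior` block in `autoscaling/v2`. A sketch of the shape we converged on; the numbers are illustrative, not a recommendation.

```yaml
# hpa.yaml (illustrative): capped replicas plus a scale-down cooldown
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: example-api              # hypothetical service
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: example-api
  minReplicas: 2
  maxReplicas: 6                 # lowered from a much higher ceiling
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70
  behavior:
    scaleDown:
      stabilizationWindowSeconds: 600   # wait 10 min before scaling down
      policies:
        - type: Pods
          value: 1               # then drop at most one pod per minute
          periodSeconds: 60
```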
Step 5: Track Node Usage, Not Just Pods
We were only watching pods. But we pay for nodes.
So we built dashboards showing:
- Node CPU/memory utilization.
- Disk usage and IOPS.
- Imbalanced workloads.
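Under the hood those panels were a handful of node-exporter queries. A sketch of the recording rules behind the CPU and memory views, assuming node-exporter is scraped:

```yaml
# recording rules (illustrative): node-level utilization for the dashboards
groups:
  - name: node-utilization
    rules:
      # Fraction of each node's CPU in use (1 minus idle time)
      - record: node:cpu_utilization:ratio
        expr: |
          1 - avg by (instance) (
            rate(node_cpu_seconds_total{mode="idle"}[5m])
          )
      # Fraction of each node's memory in use
      - record: node:memory_utilization:ratio
        expr: |
          1 - (node_memory_MemAvailable_bytes / node_memory_MemTotal_bytes)
```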
We adjusted pod scheduling using affinities, taints, and tolerations — and started moving non-critical services to cheaper spot instances.
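The spot move was mostly a scheduling nudge: a preferred node affinity plus a toleration on the non-critical services. A sketch; the label and taint keys below are hypothetical and depend on how your provider or node pools tag spot capacity.

```yaml
# pod spec (excerpt, illustrative): prefer spot nodes, tolerate their taint
spec:
  affinity:
    nodeAffinity:
      preferredDuringSchedulingIgnoredDuringExecution:
        - weight: 100
          preference:
            matchExpressions:
              - key: node-lifecycle     # hypothetical spot-node label
                operator: In
                values: ["spot"]
  tolerations:
    - key: node-lifecycle               # hypothetical spot-node taint
      operator: Equal
      value: spot
      effect: NoSchedule
```

Because the affinity is preferred rather than required, pods still land somewhere if spot capacity disappears.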
Step 6: Set Budgets and Cost Alerts in Kubernetes
We set up simple cost monitoring:
- Alert when monthly projected spend exceeds budget.
- Alert if new workloads launch without resource limits.
It took less than an hour to set up. Saved thousands.
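The budget projection leaned on our billing data, so it won't copy-paste. The missing-limits alert is generic, though; a sketch, assuming kube-state-metrics:

```yaml
# alert rule (illustrative): containers running without a CPU limit
groups:
  - name: cost-guardrails
    rules:
      - alert: ContainerWithoutCpuLimit
        # Containers with no CPU-limit series in kube-state-metrics
        expr: |
          count by (namespace, pod, container) (kube_pod_container_info)
          unless on (namespace, pod, container)
          kube_pod_container_resource_limits{resource="cpu"}
        for: 15m
        labels:
          severity: warning
        annotations:
          summary: "{{ $labels.namespace }}/{{ $labels.pod }}/{{ $labels.container }} is running without a CPU limit"
```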
Cost Optimization with Kubernetes and Prometheus
- 32% drop in monthly cloud spend.
- No performance impact.
- No app refactoring.
- Better cluster hygiene.
We also built a small internal dashboard to show “waste potential” across services, encouraging proactive cleanup.
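"Waste potential" was just requested-minus-used. A sketch of the kind of recording rule that fed it, aggregated per namespace for simplicity:

```yaml
# recording rule (illustrative): requested-but-unused CPU cores per namespace
groups:
  - name: waste-potential
    rules:
      - record: namespace:cpu_waste:cores
        expr: |
          sum by (namespace) (
            kube_pod_container_resource_requests{resource="cpu"}
          )
          -
          sum by (namespace) (
            rate(container_cpu_usage_seconds_total{container!=""}[1h])
          )
```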
Conclusions
You don’t need to re-architect your app or change providers. Start by actually looking at your usage data. Most cloud waste is hiding in plain sight.
Prometheus and Kubernetes aren’t just about uptime — they’re your keys to smarter cost control.
This process wasn’t perfect. Some cleanup was manual and messy. But it worked. If your cloud bill keeps creeping up and you “swear nothing changed” — trust me, something did.
Spend one afternoon in Prometheus. You’ll probably find enough inefficiencies to pay for lunch. Every week.