How we slashed our EKS costs by 43% with one simple scheduler tweak 🚀
- AWS EKS costs can escalate due to massive, parallel workloads in life sciences/drug development (e.g., genomic sequencing, molecular modeling).
- The default Kubernetes scheduler uses the `leastAllocated` strategy, spreading pods across many nodes for fairness and high availability.
- The `leastAllocated` strategy leaves many nodes partially utilized, which prevents autoscalers from scaling them down and increases costs.
- The `mostAllocated` scheduling strategy "packs" pods onto fewer nodes, maximizing utilization and enabling autoscalers like Karpenter to remove idle nodes.
- Switching to `mostAllocated` can reduce runtime costs significantly (e.g., ~10% in UAT, 43% in PROD environments).
- Deploying a custom scheduler on AWS EKS requires creating a service account, `ClusterRoleBindings`, a `RoleBinding`, a `ConfigMap` with the `mostAllocated` scoring strategy, and a deployment whose container image matches the cluster's Kubernetes version.
- Resource weights can prioritize packing of expensive resources (e.g., high weight on GPUs for ML workloads).
- Testing in non-production environments is recommended before full rollout.
- Implementing `mostAllocated` scheduling can dramatically optimize costs by enabling cluster autoscalers to shut down unused nodes.
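
To make the difference between the two strategies concrete, here is a minimal Python sketch of how a node could be scored under each one. It is a simplified model of the weighted scoring idea, not the scheduler's actual code: the function name `weighted_score`, the node sizes, and the weights are all illustrative assumptions, and the "requested" figures are taken to already include the pod being placed.

```python
MAX_NODE_SCORE = 100  # node scores are on a 0-100 scale

def weighted_score(requested, allocatable, weights, pack):
    """Weighted average of per-resource scores for one node.

    pack=True  rewards used capacity (mostAllocated-style bin packing).
    pack=False rewards free capacity (leastAllocated-style spreading).
    """
    total, weight_sum = 0.0, 0
    for resource, weight in weights.items():
        capacity = allocatable.get(resource, 0)
        if capacity <= 0:
            continue  # skip resources this node does not offer
        used = min(requested.get(resource, 0), capacity)
        counted = used if pack else capacity - used
        total += weight * counted * MAX_NODE_SCORE / capacity
        weight_sum += weight
    return total / weight_sum if weight_sum else 0.0

# Hypothetical nodes: 4 CPUs and 16 GiB of memory each.
allocatable = {"cpu": 4, "memory": 16}
weights = {"cpu": 1, "memory": 1}

# Requests on each node *after* adding a pod that asks for 1 CPU / 2 GiB.
busy_node = {"cpu": 4, "memory": 10}   # was already running 3 CPU / 8 GiB
empty_node = {"cpu": 1, "memory": 2}   # was running nothing

busy_pack = weighted_score(busy_node, allocatable, weights, pack=True)
empty_pack = weighted_score(empty_node, allocatable, weights, pack=True)
busy_spread = weighted_score(busy_node, allocatable, weights, pack=False)
empty_spread = weighted_score(empty_node, allocatable, weights, pack=False)

print(busy_pack, empty_pack)      # 81.25 18.75 -> packing prefers the busy node
print(busy_spread, empty_spread)  # 18.75 81.25 -> spreading prefers the empty node
```

With packing, the already-busy node wins, so the empty node stays empty and an autoscaler can remove it. With spreading, the pod lands on the empty node and both nodes stay partially used.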

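The scheduler configuration stored in the `ConfigMap` might look roughly like the structure below. It is built here as a Python dictionary and printed as JSON for illustration. The scheduler name `bin-packing-scheduler` and the weights are hypothetical, and the exact `apiVersion` depends on your cluster's Kubernetes version. In the scheduler configuration itself, the strategy type is spelled `MostAllocated` and is set on the `NodeResourcesFit` plugin.

```python
import json

def build_scheduler_config(scheduler_name, resource_weights):
    """Return a KubeSchedulerConfiguration-shaped dict that scores nodes
    with the MostAllocated strategy and the given per-resource weights."""
    return {
        "apiVersion": "kubescheduler.config.k8s.io/v1",
        "kind": "KubeSchedulerConfiguration",
        "profiles": [
            {
                # Pods opt in by setting spec.schedulerName to this value.
                "schedulerName": scheduler_name,
                "pluginConfig": [
                    {
                        "name": "NodeResourcesFit",
                        "args": {
                            "scoringStrategy": {
                                "type": "MostAllocated",
                                "resources": [
                                    {"name": name, "weight": weight}
                                    for name, weight in resource_weights.items()
                                ],
                            }
                        },
                    }
                ],
            }
        ],
    }

# Hypothetical weights: pack GPUs most aggressively, since they cost the most.
config = build_scheduler_config(
    "bin-packing-scheduler",
    {"cpu": 1, "memory": 1, "nvidia.com/gpu": 5},
)
print(json.dumps(config, indent=2))
```

Keeping the weights in one mapping makes it easy to try different values in a non-production environment before rolling the scheduler out more widely.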