An SLO Driven and Cost-Aware Autoscaling Framework for Kubernetes
By: Vinoth Punniyamoorthy, Bikesh Kumar, Sumit Saha and more
Potential Business Impact:
Helps applications running on Kubernetes meet their service-level targets more reliably while lowering infrastructure cost.
Kubernetes provides native autoscaling mechanisms, including the Horizontal Pod Autoscaler, Vertical Pod Autoscaler, and node-level autoscalers, to enable elastic resource management for cloud-native applications. However, production environments frequently experience Service Level Objective violations and cost inefficiencies due to reactive scaling behavior, limited use of application-level signals, and opaque control logic. This paper investigates how Kubernetes autoscaling can be enhanced using AIOps principles to jointly satisfy SLO and cost constraints under diverse workload patterns without compromising safety or operational transparency. We present a gap-driven analysis of existing autoscaling approaches and propose a safe and explainable multi-signal autoscaling framework that integrates SLO-aware and cost-conscious control with lightweight demand forecasting. Experimental evaluation using representative microservice and event-driven workloads shows that the proposed approach reduces SLO violation duration by up to 31 percent, improves scaling response time by 24 percent, and lowers infrastructure cost by 18 percent compared to default and tuned Kubernetes autoscaling baselines, while maintaining stable and auditable control behavior. These results demonstrate that AIOps-driven, SLO-first autoscaling can significantly improve the reliability, efficiency, and operational trustworthiness of Kubernetes-based cloud platforms.
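The abstract describes combining SLO-aware and cost-conscious control with lightweight demand forecasting, but gives no algorithm. The following Python sketch illustrates one way such a multi-signal decision could look; it is not the authors' implementation. It reuses the ratio rule documented for the Horizontal Pod Autoscaler, desired = ceil(current * observed / target), and omits the HPA's tolerance band and stabilization window. Everything else is an assumption made for illustration: the names Signals, ewma_forecast, and desired_replicas, the exponentially weighted moving average standing in for "lightweight demand forecasting", the per-replica throughput figure, and the replica budget used as a cost cap.

```python
import math
from dataclasses import dataclass


@dataclass
class Signals:
    """Hypothetical snapshot of the signals the controller looks at."""
    current_replicas: int
    cpu_utilization: float   # observed average utilization, 0..1
    p95_latency_ms: float    # observed application-level latency
    request_rate: float      # observed requests per second


def hpa_ratio(current_replicas: int, observed: float, target: float) -> int:
    """HPA-style ratio rule: ceil(current * observed / target)."""
    return math.ceil(current_replicas * observed / target)


def ewma_forecast(history: list[float], alpha: float = 0.5) -> float:
    """Exponentially weighted moving average over a non-empty history.

    Assumed stand-in for the forecasting component; the source does not
    say which forecaster is used.
    """
    level = history[0]
    for x in history[1:]:
        level = alpha * x + (1 - alpha) * level
    return level


def desired_replicas(
    sig: Signals,
    rate_history: list[float],
    *,
    cpu_target: float = 0.6,          # assumed utilization target
    latency_slo_ms: float = 200.0,    # assumed latency SLO
    per_replica_rps: float = 100.0,   # assumed capacity of one replica
    max_replicas_budget: int = 20,    # assumed cost cap, in replicas
    min_replicas: int = 1,
) -> tuple[int, dict]:
    """Pick a replica count and return the reasons behind it."""
    by_cpu = hpa_ratio(sig.current_replicas, sig.cpu_utilization, cpu_target)
    by_latency = hpa_ratio(sig.current_replicas, sig.p95_latency_ms, latency_slo_ms)
    forecast = ewma_forecast(rate_history + [sig.request_rate])
    by_forecast = math.ceil(forecast / per_replica_rps)

    # SLO-first: take the largest demand any signal asks for,
    # then clamp it to the floor and to the cost budget.
    wanted = max(by_cpu, by_latency, by_forecast, min_replicas)
    chosen = min(wanted, max_replicas_budget)

    # Returning the per-signal values keeps the decision auditable.
    reason = {
        "by_cpu": by_cpu,
        "by_latency": by_latency,
        "by_forecast": by_forecast,
        "capped_by_budget": wanted > max_replicas_budget,
    }
    return chosen, reason


if __name__ == "__main__":
    sig = Signals(current_replicas=4, cpu_utilization=0.7,
                  p95_latency_ms=320.0, request_rate=500.0)
    print(desired_replicas(sig, [300.0, 400.0]))
```

Taking the maximum across signals lets a latency violation drive scale-out even when CPU looks healthy, and the returned dictionary records which signal won and whether the budget overrode it. A real controller would also need rate limiting and a scale-down policy, which this sketch leaves out.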