The High Cost of Keeping Warm: Characterizing Overhead in Serverless Autoscaling Policies
By: Leonid Kondrashov, Boxi Zhou, Hancheng Wang, and others
Potential Business Impact:
Makes cloud apps run faster and cheaper.
Serverless computing is transforming cloud application development, but the performance-cost trade-offs of control plane designs remain poorly understood due to a lack of open, cross-platform benchmarks and detailed system analyses. In this work, we address these gaps by designing a serverless system that approximates the scaling behaviors of commercial providers, including AWS Lambda and Google Cloud Run. We systematically compare the performance and cost-efficiency of both synchronous and asynchronous autoscaling policies by replaying real-world workloads and varying key autoscaling parameters. We demonstrate that our open-source systems can closely replicate the operational characteristics of commercial platforms, enabling reproducible and transparent experimentation. By evaluating how autoscaling parameters affect latency, memory usage, and CPU overhead, we reveal several key findings. First, serverless systems exhibit significant computational overhead due to instance churn, equivalent to 10-40% of the CPU cycles spent on request handling, and this overhead originates primarily from worker nodes. Second, the scaling policy drives high memory allocation: instances reserve 2-10 times more memory than is actively used. Finally, reducing these overheads in current systems typically causes significant performance degradation, underscoring the need for new, cost-efficient autoscaling strategies. Additionally, we employ a hybrid methodology that combines real control plane deployments with large-scale simulation to extend our evaluation closer to production scale, thereby bridging the gap between small research clusters and real-world environments.
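To make the trade-off concrete, the sketch below replays a toy per-second concurrency trace through a simplified concurrency-based autoscaler (in the style of Cloud Run or Knative) and tallies cold starts (a proxy for instance-churn overhead) against allocated versus actively used capacity. This is a minimal illustrative model, not the paper's system; the parameter names (target_concurrency, keep_alive_s) and the trace are assumptions chosen only to show how keep-alive and scaling targets shape churn and over-allocation.

```python
# Hypothetical sketch of a concurrency-based autoscaling policy.
# Not the paper's implementation: parameters and trace are illustrative.
import math
from dataclasses import dataclass


@dataclass
class ScalingStats:
    cold_starts: int = 0           # proxy for instance-churn CPU overhead
    instance_seconds: float = 0.0  # capacity reserved (memory/CPU held)
    busy_seconds: float = 0.0      # capacity actually serving requests


def replay(trace, target_concurrency=10, keep_alive_s=60, tick_s=1.0):
    """Replay a per-tick concurrency trace through a toy autoscaler."""
    stats = ScalingStats()
    running = 0        # currently provisioned instances
    idle_since = 0.0   # seconds the pool has been over-provisioned
    for concurrency in trace:
        desired = math.ceil(concurrency / target_concurrency)
        if desired > running:
            # Scale up immediately; each new instance is a cold start.
            stats.cold_starts += desired - running
            running = desired
        if desired >= running:
            idle_since = 0.0
        else:
            # Scale down only after the keep-alive window expires.
            idle_since += tick_s
            if idle_since >= keep_alive_s:
                running = desired
                idle_since = 0.0
        stats.instance_seconds += running * tick_s
        # Equivalent number of fully busy instances this tick.
        stats.busy_seconds += (
            min(concurrency, running * target_concurrency) / target_concurrency * tick_s
        )
    return stats


if __name__ == "__main__":
    # Bursty toy trace: quiet, short spike, quiet again.
    trace = [2] * 120 + [200] * 30 + [2] * 300
    s = replay(trace)
    print(f"cold starts: {s.cold_starts}")
    print(f"allocated vs. used capacity: {s.instance_seconds / max(s.busy_seconds, 1e-9):.1f}x")
```

Even on this small trace, a long keep-alive window keeps many instances resident after the spike, so allocated capacity exceeds used capacity by a multiple, while shortening the window trades that memory overhead for more cold starts on the next burst; this is the same tension between churn overhead, memory over-allocation, and latency that the paper quantifies at scale.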
Similar Papers
Demystifying Serverless Costs on Public Platforms: Bridging Billing, Architecture, and OS Scheduling
Distributed, Parallel, and Cluster Computing
Makes cloud computing cheaper by fixing hidden costs.
Getting to the Bottom of Serverless Billing
Distributed, Parallel, and Cluster Computing
Saves money on cloud computer services.
Towards Energy-Efficient Serverless Computing with Hardware Isolation
Distributed, Parallel, and Cluster Computing
Saves energy by giving each task its own tiny computer.