Multi-dimensional Autoscaling of Processing Services: A Comparison of Agent-based Methods
By: Boris Sedlak, Alireza Furutanpey, Zihang Wang and more
Potential Business Impact:
Makes computers work better with less power.
Edge computing breaks with traditional autoscaling because of its strict resource constraints, which motivates more flexible scaling behavior across multiple elasticity dimensions. This work introduces an agent-based autoscaling framework that dynamically adjusts both hardware resources and internal service configurations to maximize requirement fulfillment in constrained environments. We compare four types of scaling agents (Active Inference, Deep Q Network, Analysis of Structural Knowledge, and Deep Active Inference) on two real-world processing services running in parallel: YOLOv8 for visual recognition and OpenCV for QR code detection. Results show that all agents achieve acceptable Service Level Objective (SLO) performance, with varying convergence patterns. The Deep Q Network benefits from pre-training, the structural analysis converges quickly, and the Deep Active Inference agent combines theoretical foundations with practical scalability advantages. Our findings provide evidence for the viability of multi-dimensional agent-based autoscaling in edge environments and encourage future work in this research direction.
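To make the idea of scaling along more than one elasticity dimension concrete, the following is a minimal sketch and not any of the four agents compared in the paper. It assumes a single service with one hardware dimension (`cores`) and one service-configuration dimension (`quality`, for example an input resolution), a made-up throughput model, and made-up SLO targets. A naive greedy rule then changes one dimension per step to raise the mean SLO fulfillment. All names and numbers are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Config:
    cores: int    # hardware dimension (hypothetical)
    quality: int  # service dimension, e.g. input resolution (hypothetical)


# Hypothetical SLO targets and bounds, chosen only for illustration.
TARGET_THROUGHPUT = 10.0
TARGET_QUALITY = 720
QUALITY_STEP = 120
QUALITY_RANGE = (240, 1080)


def slo_fulfillment(value: float, target: float) -> float:
    """Fraction of an SLO target that is met, capped at 1.0."""
    return min(1.0, value / target)


def estimated_throughput(cfg: Config) -> float:
    # Toy assumption: throughput rises with cores and falls with quality.
    return 2000.0 * cfg.cores / cfg.quality


def overall_fulfillment(cfg: Config) -> float:
    """Mean fulfillment over the throughput SLO and the quality SLO."""
    return 0.5 * (
        slo_fulfillment(estimated_throughput(cfg), TARGET_THROUGHPUT)
        + slo_fulfillment(cfg.quality, TARGET_QUALITY)
    )


def greedy_step(cfg: Config, max_cores: int) -> Config:
    """Pick the best of: stay, or move one step in a single dimension."""
    candidates = [
        cfg,  # listed first so that ties keep the current configuration
        Config(cfg.cores + 1, cfg.quality),
        Config(cfg.cores - 1, cfg.quality),
        Config(cfg.cores, cfg.quality + QUALITY_STEP),
        Config(cfg.cores, cfg.quality - QUALITY_STEP),
    ]
    feasible = [
        c for c in candidates
        if 1 <= c.cores <= max_cores
        and QUALITY_RANGE[0] <= c.quality <= QUALITY_RANGE[1]
    ]
    return max(feasible, key=overall_fulfillment)


def run(start: Config, max_cores: int, max_steps: int = 20) -> Config:
    """Apply greedy steps until the configuration stops changing."""
    cfg = start
    for _ in range(max_steps):
        nxt = greedy_step(cfg, max_cores)
        if nxt == cfg:
            break
        cfg = nxt
    return cfg


if __name__ == "__main__":
    start = Config(cores=1, quality=480)
    final = run(start, max_cores=4)
    print(start, overall_fulfillment(start))
    print(final, overall_fulfillment(final))
```

A one-step greedy rule like this can stall in a local optimum when improving one SLO costs as much on the other, which is one reason learning-based agents are of interest for this problem.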
Similar Papers
Towards Multi-dimensional Elasticity for Pervasive Stream Processing Services
Performance
Makes smart city services work better with less power.
Scalability Optimization in Cloud-Based AI Inference Services: Strategies for Real-Time Load Balancing and Automated Scaling
Distributed, Parallel, and Cluster Computing
Makes AI services faster and use less power.
Multi-Dimensional Autoscaling of Stream Processing Services on Edge Devices
Distributed, Parallel, and Cluster Computing
Helps small computers run many apps smoothly.