Autonomous Resource Management in Microservice Systems via Reinforcement Learning
By: Yujun Zou, Nia Qi, Yingnan Deng, and more
Potential Business Impact:
Makes computer programs run faster and cheaper.
This paper proposes a reinforcement learning-based method for scheduling and optimizing microservice resources. It targets three problems in traditional microservice architectures: uneven resource allocation, high latency, and insufficient throughput. As the number of services and the load grow, efficiently scheduling and allocating compute, memory, and storage becomes a critical research challenge. The paper addresses this with an intelligent scheduling algorithm in which an agent interacts with the environment and continuously refines its resource allocation strategy. The experiments cover different resource conditions and load scenarios and evaluate the method along several dimensions: response time, throughput, resource utilization, and cost efficiency. The results show that the reinforcement learning-based scheduler significantly improves system response speed and throughput under both low-load and high-concurrency conditions, while also improving resource utilization and reducing energy consumption. Under multi-dimensional resource conditions, the method can account for multiple objectives at once and still achieve optimized resource scheduling. Compared with traditional static resource allocation, the reinforcement learning model is more adaptable and optimizes more effectively: it adjusts allocation strategies in real time and so maintains good system performance under dynamically changing loads and resource environments.
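The agent-environment loop described in the abstract can be illustrated with a small toy example. The sketch below is not the paper's algorithm: it swaps in plain tabular Q-learning on a made-up single-service environment, and every name and number in it (`DEMAND`, `LATENCY_WEIGHT`, `COST_WEIGHT`, the three load levels, the replica limit) is a hypothetical chosen for illustration. The state is the pair (load level, replica count), the actions scale the service down, hold, or scale up, and the reward trades unmet demand (a stand-in for latency) against the cost of running replicas.

```python
import random

# All constants below are illustrative assumptions, not values from the paper.
DEMAND = [1, 2, 4]        # capacity units needed at low / medium / high load
MAX_REPLICAS = 4
ACTIONS = [-1, 0, 1]      # remove a replica, hold, add a replica
LATENCY_WEIGHT = 5.0      # assumed penalty per unit of unmet demand
COST_WEIGHT = 1.0         # assumed cost per running replica


def reward(load, replicas):
    """Negative cost: penalize unmet demand (latency proxy) and replica count."""
    unmet = max(0, DEMAND[load] - replicas)
    return -(LATENCY_WEIGHT * unmet + COST_WEIGHT * replicas)


def step(load, replicas, action, rng):
    """Apply a scaling action, let the load drift by at most one level,
    and score the resulting (load, replicas) state."""
    replicas = min(MAX_REPLICAS, max(1, replicas + action))
    load = min(len(DEMAND) - 1, max(0, load + rng.choice([-1, 0, 1])))
    return load, replicas, reward(load, replicas)


def train(steps=50_000, alpha=0.1, gamma=0.9, epsilon=0.2, seed=0):
    """Tabular Q-learning with an epsilon-greedy behavior policy."""
    rng = random.Random(seed)
    q = {(l, r, a): 0.0
         for l in range(len(DEMAND))
         for r in range(1, MAX_REPLICAS + 1)
         for a in ACTIONS}
    load, replicas = 0, 1
    for _ in range(steps):
        if rng.random() < epsilon:
            action = rng.choice(ACTIONS)                       # explore
        else:
            action = max(ACTIONS, key=lambda a: q[(load, replicas, a)])  # exploit
        next_load, next_replicas, r = step(load, replicas, action, rng)
        best_next = max(q[(next_load, next_replicas, a)] for a in ACTIONS)
        key = (load, replicas, action)
        q[key] += alpha * (r + gamma * best_next - q[key])     # Q-learning update
        load, replicas = next_load, next_replicas
    return q


def greedy_action(q, load, replicas):
    """Scaling decision the learned table prefers in a given state."""
    return max(ACTIONS, key=lambda a: q[(load, replicas, a)])


if __name__ == "__main__":
    q = train()
    for load in range(len(DEMAND)):
        print(load, [greedy_action(q, load, r) for r in range(1, MAX_REPLICAS + 1)])
```

Running the script prints, for each load level, the preferred scaling action at each replica count. A static allocation would fix the replica count in advance, whereas the learned table can pick a different action as the load drifts, which is the adaptability the abstract contrasts with static methods. A realistic scheduler would need a much richer state (CPU, memory, storage, request queues) and likely function approximation in place of a table.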
Similar Papers
Multi-Agent Reinforcement Learning for Adaptive Resource Orchestration in Cloud-Native Clusters
Machine Learning (CS)
Makes computer databases run faster and smoother.
Deep Reinforcement Learning for Job Scheduling and Resource Management in Cloud Computing: An Algorithm-Level Review
Distributed, Parallel, and Cluster Computing
Teaches computers to manage cloud jobs better.
A Reinforcement Learning-Driven Task Scheduling Algorithm for Multi-Tenant Distributed Systems
Distributed, Parallel, and Cluster Computing
Makes computers share resources fairly and fast.