Enhancing Cluster Scheduling in HPC: A Continuous Transfer Learning for Real-Time Optimization

Published: September 22, 2025 | arXiv ID: 2509.22701v1

By: Leszek Sliwko, Jolanta Mizera-Pietraszko

Potential Business Impact:

Makes computer jobs run faster and smarter.

Business Areas:

Scheduling, Information Technology, Software

This study presents a machine learning-assisted approach to optimize task scheduling in cluster systems, focusing on node-affinity constraints. Traditional schedulers like Kubernetes struggle with real-time adaptability, whereas the proposed continuous transfer learning model evolves dynamically during operations, minimizing retraining needs. Evaluated on Google Cluster Data, the model achieves over 99% accuracy, reducing computational overhead and improving scheduling latency for constrained tasks. This scalable solution enables real-time optimization, advancing machine learning integration in cluster management and paving the way for future adaptive scheduling strategies.
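The abstract does not describe the model's internals, so the following is only a rough illustration of the general idea of updating a model during operation instead of retraining it from scratch. It is not the authors' method: a tiny logistic regression stands in for their model, and the class `OnlineLogisticModel`, the helper `make_task`, the two features, and the synthetic data are all hypothetical.

```python
import math
import random


class OnlineLogisticModel:
    """Hypothetical stand-in: logistic regression updated one sample at a time."""

    def __init__(self, n_features, lr=0.5):
        self.w = [0.0] * n_features
        self.b = 0.0
        self.lr = lr

    def predict_proba(self, x):
        z = self.b + sum(wi * xi for wi, xi in zip(self.w, x))
        z = max(-30.0, min(30.0, z))  # clamp to keep exp() well-behaved
        return 1.0 / (1.0 + math.exp(-z))

    def update(self, x, y):
        # One stochastic-gradient step on the log loss for a single observation.
        err = self.predict_proba(x) - y
        self.w = [wi - self.lr * err * xi for wi, xi in zip(self.w, x)]
        self.b -= self.lr * err


def make_task(rng):
    """Synthetic sample (assumption): features are [task requires SSD, node has SSD].

    The label is 1 when the node satisfies the task's affinity constraint.
    """
    needs_ssd = rng.randint(0, 1)
    has_ssd = rng.randint(0, 1)
    label = 0 if (needs_ssd == 1 and has_ssd == 0) else 1
    return [float(needs_ssd), float(has_ssd)], label


if __name__ == "__main__":
    rng = random.Random(0)
    model = OnlineLogisticModel(n_features=2)
    # The model is refined as each observation arrives; nothing is retrained in bulk.
    for _ in range(5000):
        x, y = make_task(rng)
        model.update(x, y)
    for case in ([0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [1.0, 1.0]):
        print(case, round(model.predict_proba(case), 3))
```

Each call to `update` costs a handful of arithmetic operations, which is the property that makes per-observation updates attractive when scheduling decisions must stay low-latency.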

Learning to Schedule: A Supervised Learning Framework for Network-Aware Scheduling of Data-Intensive Workloads

Distributed, Parallel, and Cluster Computing

Makes computer jobs run faster by predicting delays.

24 Oct 2025

88%

Hybrid Learning and Optimization-Based Dynamic Scheduling for DL Workloads on Heterogeneous GPU Clusters

Distributed, Parallel, and Cluster Computing

Makes computer jobs run faster and use less power.

11 Dec 2025

88%

Adaptive Job Scheduling in Quantum Clouds Using Reinforcement Learning

Distributed, Parallel, and Cluster Computing

Lets many small quantum computers work together.

12 Jun 2025

Page Count

11 pages
