Cluster Workload Allocation: A Predictive Approach Leveraging Machine Learning Efficiency
By: Leszek Sliwko
Potential Business Impact:
Helps computers pick the right machines for their jobs.
This research investigates how Machine Learning (ML) algorithms can assist workload allocation strategies by detecting tasks with node affinity operators (referred to as constraint operators), which restrict their execution to a limited set of nodes. Using real-world Google Cluster Data (GCD) workload traces and the AGOCS framework, the study extracts node attributes and task constraints, then analyses them to identify suitable node-task pairings. It focuses on tasks that can run on only a single node, or on fewer than 1,000 of the 12.5k nodes in the analysed GCD cluster. Task constraint operators are compacted, pre-processed with one-hot encoding, and used as features in a training dataset. Various ML classifiers, including Artificial Neural Networks, K-Nearest Neighbours, Decision Trees, Naive Bayes, Ridge Regression, Adaptive Boosting, and Bagging, are fine-tuned and assessed on accuracy and F1-score. The final ensemble voting classifier achieved 98% accuracy and a 1.5-1.8% misclassification rate for tasks with a single suitable node.
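To make the described pipeline concrete, below is a minimal sketch in Python with scikit-learn: one-hot style constraint-operator flags as features, base classifiers from the families named in the abstract, and a voting ensemble scored on accuracy and F1. The synthetic data, feature layout, labels, and hyperparameters are illustrative assumptions, not the paper's actual GCD/AGOCS feature set or tuned models.

# Sketch of the classifier-ensemble pipeline described in the abstract.
# Data, features, and hyperparameters are hypothetical placeholders.
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier
from sklearn.naive_bayes import BernoulliNB
from sklearn.linear_model import RidgeClassifier
from sklearn.ensemble import AdaBoostClassifier, BaggingClassifier, VotingClassifier
from sklearn.neural_network import MLPClassifier
from sklearn.metrics import accuracy_score, f1_score

rng = np.random.default_rng(42)

# Stand-in for one-hot encoded, compacted task constraint operators:
# each row is a task, each column a binary constraint-operator flag.
n_tasks, n_operator_flags = 5000, 40
X = rng.integers(0, 2, size=(n_tasks, n_operator_flags))

# Label: 1 if the task is "hard to place" (only one / very few suitable nodes).
# Synthesised from a hidden rule purely for illustration.
y = (X[:, :5].sum(axis=1) >= 4).astype(int)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0)

# Base classifiers roughly matching the families listed in the abstract.
estimators = [
    ("ann", MLPClassifier(hidden_layer_sizes=(64,), max_iter=500, random_state=0)),
    ("knn", KNeighborsClassifier(n_neighbors=5)),
    ("tree", DecisionTreeClassifier(max_depth=10, random_state=0)),
    ("nb", BernoulliNB()),
    ("ridge", RidgeClassifier()),
    ("ada", AdaBoostClassifier(random_state=0)),
    ("bag", BaggingClassifier(random_state=0)),
]

# Majority-vote ensemble over the fitted base classifiers.
voter = VotingClassifier(estimators=estimators, voting="hard")
voter.fit(X_train, y_train)

pred = voter.predict(X_test)
print(f"accuracy: {accuracy_score(y_test, pred):.3f}")
print(f"F1-score: {f1_score(y_test, pred):.3f}")

A hard (majority) vote is used here because RidgeClassifier does not expose class probabilities; the paper's actual voting scheme, feature compaction, and tuned hyperparameters may differ.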
Similar Papers
Guiding Application Users via Estimation of Computational Resources for Massively Parallel Chemistry Computations
Machine Learning (CS)
Predicts computer time to save money.
Machine learning-based cloud resource allocation algorithms: a comprehensive comparative review
Distributed, Parallel, and Cluster Computing
Makes computers use cloud power smarter and cheaper.
Enhancing Cluster Scheduling in HPC: A Continuous Transfer Learning for Real-Time Optimization
Distributed, Parallel, and Cluster Computing
Makes computer jobs run faster and smarter.