Evaluating Large Language Models for Workload Mapping and Scheduling in Heterogeneous HPC Systems

Published: November 4, 2025 | arXiv ID: 2511.11612v1

By: Aasish Kumar Sharma, Julian Kunkel

Potential Business Impact:

Lets computers solve hard scheduling puzzles from words.

Business Areas:

Natural Language Processing, Artificial Intelligence, Data and Analytics, Software

Large language models (LLMs) are increasingly explored for their reasoning capabilities, yet their ability to perform structured, constraint-based optimization from natural language remains insufficiently understood. This study evaluates twenty-one publicly available LLMs on a representative heterogeneous high-performance computing (HPC) workload mapping and scheduling problem. Each model received the same textual description of system nodes, task requirements, and scheduling constraints, and was required to assign tasks to nodes, compute the total makespan, and explain its reasoning. A manually derived analytical optimum of nine hours and twenty seconds served as the ground truth reference. Three models exactly reproduced the analytical optimum while satisfying all constraints, twelve achieved near-optimal results within two minutes of the reference, and six produced suboptimal schedules with arithmetic or dependency errors. All models generated feasible task-to-node mappings, though only about half maintained strict constraint adherence. Nineteen models produced partially executable verification code, and eighteen provided coherent step-by-step reasoning, demonstrating strong interpretability even when logical errors occurred. Overall, the results define the current capability boundary of LLM reasoning in combinatorial optimization: leading models can reconstruct optimal schedules directly from natural language, but most still struggle with precise timing, data transfer arithmetic, and dependency enforcement. These findings highlight the potential of LLMs as explainable co-pilots for optimization and decision-support tasks rather than autonomous solvers.
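To make the task concrete, here is a minimal sketch of the kind of check the abstract describes: given a task-to-node mapping, compute the makespan while enforcing dependencies and a data-transfer delay. It is not the authors' verification code. The `Task` fields, the `makespan` function, the single flat `transfer_s` delay, and the example task names and durations are all hypothetical, chosen only for illustration.

```python
from __future__ import annotations
from dataclasses import dataclass, field


@dataclass
class Task:
    name: str
    duration_s: int          # run time on its assigned node, in seconds (hypothetical)
    node: str                # node the schedule maps this task to
    deps: list[str] = field(default_factory=list)


def makespan(tasks: list[Task], transfer_s: int = 0) -> int:
    """Return the finish time of the last task, in seconds.

    Assumptions of this sketch: tasks on the same node run one at a time in
    list order, and a task starts only after every dependency has finished,
    plus a flat transfer_s delay when the dependency ran on a different node.
    """
    by_name = {t.name: t for t in tasks}
    finish: dict[str, int] = {}      # task name -> finish time
    node_free: dict[str, int] = {}   # node name -> time it becomes idle
    for t in tasks:
        ready = 0
        for d in t.deps:
            if d not in finish:
                # Dependency violation: the task is listed before its input exists.
                raise ValueError(f"{t.name} scheduled before dependency {d}")
            delay = transfer_s if by_name[d].node != t.node else 0
            ready = max(ready, finish[d] + delay)
        start = max(ready, node_free.get(t.node, 0))
        finish[t.name] = start + t.duration_s
        node_free[t.node] = finish[t.name]
    return max(finish.values(), default=0)


# Hypothetical three-task pipeline spread over one CPU node and one GPU node.
schedule = [
    Task("prep", 3600, "cpu0"),
    Task("train", 7200, "gpu0", deps=["prep"]),
    Task("post", 1800, "cpu0", deps=["train"]),
]
print(makespan(schedule, transfer_s=20))  # 12640 seconds
```

In this toy schedule the two cross-node transfers add 40 seconds to the 12600 seconds of compute. Small additive terms of this kind are where the abstract reports that most models lose precision.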

Evaluating the Efficacy of LLM-Based Reasoning for Multiobjective HPC Job Scheduling

Distributed, Parallel, and Cluster Computing

Makes computers finish jobs faster and fairer.

29 May 2025 1

92%

Do Large Language Models Understand Performance Optimization?

Distributed, Parallel, and Cluster Computing

Computers write faster, but sometimes make mistakes.

17 Mar 2025 2

91%

Cross-Task Benchmarking and Evaluation of General-Purpose and Code-Specific Large Language Models

Software Engineering

Makes computers better at understanding language and code.

4 Dec 2025 1

Page Count

14 pages
