Small Language Models as Compiler Experts: Auto-Parallelization for Heterogeneous Systems
By: Prathamesh Devadiga
Traditional auto-parallelizing compilers, reliant on rigid heuristics, struggle with the complexity of modern heterogeneous systems. This paper presents a comprehensive evaluation of compiler auto-parallelization driven by small (approximately 1B-parameter) language models. We evaluate three models, gemma3, llama3.2, and qwen2.5, using six reasoning strategies across 11 real-world kernels drawn from scientific computing, graph algorithms, and machine learning. Our system is benchmarked against strong compiler baselines, including LLVM Polly, TVM, and Triton. Across 376 total evaluations, the proposed approach achieves an average speedup of 6.81x and a peak speedup of 43.25x on convolution operations. We analyze scalability, verify correctness with multiple sanitizers, and confirm robustness across diverse compilers and hardware platforms. Our results demonstrate that small, efficient language models can serve as powerful reasoning engines for complex compiler optimization tasks.
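To make the pipeline the abstract describes concrete (prompt a small model for a parallelized kernel candidate, compile it, screen it with a sanitizer, and time it), the following is a minimal sketch. The Ollama endpoint, model tags, prompt wording, and helper names are illustrative assumptions, not the authors' implementation; parallelism is expressed here as OpenMP pragmas and data races are screened with GCC's ThreadSanitizer.

```python
"""Minimal sketch of an LLM-driven auto-parallelization loop.

Assumptions (not from the paper): models are served locally via
Ollama's HTTP API, parallelization is expressed as OpenMP pragmas,
and correctness is screened with ThreadSanitizer.
"""
import json
import subprocess
import tempfile
import time
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/generate"  # default Ollama endpoint


def strip_fences(text: str) -> str:
    """Crudely drop any markdown code fences the model wraps around code."""
    lines = [l for l in text.strip().splitlines() if not l.strip().startswith("```")]
    return "\n".join(lines)


def ask_model(model: str, kernel_src: str) -> str:
    """Ask a small model to rewrite a C kernel with OpenMP pragmas."""
    prompt = (
        "Rewrite this C kernel with OpenMP pragmas for safe parallel "
        "execution. Return only code.\n\n" + kernel_src
    )
    req = urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps({"model": model, "prompt": prompt, "stream": False}).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return strip_fences(json.loads(resp.read())["response"])


def compile_and_check(src: str) -> str | None:
    """Compile with ThreadSanitizer instrumentation; return binary path or None."""
    with tempfile.NamedTemporaryFile(suffix=".c", delete=False, mode="w") as f:
        f.write(src)
        c_path = f.name
    bin_path = c_path + ".out"
    result = subprocess.run(
        ["gcc", "-O2", "-fopenmp", "-fsanitize=thread", c_path, "-o", bin_path],
        capture_output=True,
    )
    return bin_path if result.returncode == 0 else None


def run_timed(bin_path: str) -> float | None:
    """Run the instrumented binary; a nonzero exit status (e.g. a
    sanitizer-reported data race) disqualifies the candidate."""
    start = time.perf_counter()
    result = subprocess.run([bin_path], capture_output=True)
    elapsed = time.perf_counter() - start
    return elapsed if result.returncode == 0 else None


if __name__ == "__main__":
    kernel = open("kernel.c").read()  # one of the benchmark kernels
    for model in ("gemma3:1b", "llama3.2:1b", "qwen2.5:1.5b"):
        candidate = ask_model(model, kernel)
        binary = compile_and_check(candidate)
        if binary and (t := run_timed(binary)) is not None:
            print(f"{model}: candidate passed sanitizer, ran in {t:.3f}s")
```

In the paper's setting, sanitizer-clean candidates would then be timed against the LLVM Polly, TVM, and Triton baselines on each kernel to produce the reported speedups.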