LLMPerf: GPU Performance Modeling meets Large Language Models
By: Khoi N. M. Nguyen, Hoang Duy Nguyen Do, Huyen Thao Le, and others
Potential Business Impact:
Lets computers predict how fast programs will run on GPUs.
Performance modeling, a pivotal domain in program cost analysis, currently relies on manually crafted models constrained by various program and hardware limitations, especially in the intricate landscape of GPGPU. Meanwhile, Large Language Models (LLMs) have demonstrated their effectiveness in addressing diverse programming challenges. Our work establishes a connection between LLMs and performance modeling by employing an LLM as a performance estimator. Through experimental exploration with carefully designed large-scale OpenCL datasets, we highlight both the potential capabilities and the main difficulties of using LLMs for performance modeling of OpenCL device source programs. As the first study in this line of work, our LLM-based performance model achieves a mean absolute percentage error of $24.25\%$ on a large-scale generated validation set, and $46.1\%$ on a set of publicly available OpenCL programs.
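To make the setup concrete, the following is a minimal sketch of the two ingredients the abstract refers to: prompting an LLM for a runtime estimate of an OpenCL kernel, and scoring predictions with mean absolute percentage error (MAPE). This is not the authors' implementation; the prompt wording, the `query_llm` placeholder, and the launch parameters are illustrative assumptions.

```python
# Hypothetical sketch: an LLM as a performance estimator for OpenCL kernels,
# evaluated with MAPE. Names and prompt wording are assumptions, not the
# paper's actual pipeline.

from typing import List

def build_prompt(kernel_source: str, global_size: int, local_size: int) -> str:
    """Ask the LLM for a single numeric runtime estimate (microseconds)."""
    return (
        "Estimate the execution time in microseconds of the following "
        f"OpenCL kernel launched with global work size {global_size} and "
        f"local work size {local_size}. Reply with a single number only.\n\n"
        f"{kernel_source}"
    )

def query_llm(prompt: str) -> float:
    """Placeholder for an actual LLM call; returns a runtime estimate."""
    raise NotImplementedError("Wire this up to an LLM API of your choice.")

def mape(predicted: List[float], measured: List[float]) -> float:
    """Mean absolute percentage error, the metric reported in the paper."""
    assert len(predicted) == len(measured) and measured
    return 100.0 * sum(
        abs(p - m) / abs(m) for p, m in zip(predicted, measured)
    ) / len(measured)

# Worked example of the metric: predictions [110, 80] against ground truth
# [100, 100] give (10/100 + 20/100) / 2 * 100 = 15.0 % MAPE.
```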
Similar Papers
Can Large Language Models Predict Parallel Code Performance?
Distributed, Parallel, and Cluster Computing
Lets computers guess GPU speed without testing.
Do Large Language Models Understand Performance Optimization?
Distributed, Parallel, and Cluster Computing
Computers write faster, but sometimes make mistakes.
Efficient Fine-Grained GPU Performance Modeling for Distributed Deep Learning of LLM
Distributed, Parallel, and Cluster Computing
Predicts computer learning time without needing supercomputers.