Equivalence Checking of ML GPU Kernels
By: Kshitij Dubey , Benjamin Driscoll , Anjiang Wei and more
Potential Business Impact:
Checks if computer code for AI works correctly.
With the rapid progress of deep learning and large language models (LLMs), companies now spend enormous sums executing GPU kernels. These kernels have, therefore, become prime targets for aggressive optimization. Recent efforts increasingly leverage LLMs to generate GPU kernels, but make no formal guarantees about the generated kernels. We present the first equivalence checker for GPU kernels and use it to formally verify the correctness of machine learning (ML) kernels optimized by hand, by LLMs, and by compilers. We show that our equivalence checker is sound and, for a well-defined class of GPU kernels which includes the programs of interest, complete. Our implementation, VOLTA, can verify ML computations such as convolutions, matrix multiplications, and various attention mechanisms.
Similar Papers
Equivalence Checking of ML GPU Kernels
Programming Languages
Checks if computer code for AI works right.
ProofWright: Towards Agentic Formal Verification of CUDA
Software Engineering
Ensures computer code for graphics is correct and safe.
TrainVerify: Equivalence-Based Verification for Distributed LLM Training
Distributed, Parallel, and Cluster Computing
Checks if AI training is done right.