MultiKernelBench: A Multi-Platform Benchmark for Kernel Generation
By: Zhongzhen Wen, Yinghui Zhang, Zhong Li and more
Potential Business Impact:
Helps AI build faster computer programs for different chips.
The automatic generation of deep learning (DL) kernels using large language models (LLMs) has emerged as a promising approach to reduce the manual effort and hardware-specific expertise required for writing high-performance operator implementations. However, existing benchmarks for evaluating LLMs in this domain suffer from limited hardware support, coarse-grained kernel categorization, and imbalanced task coverage. To address these limitations, we introduce MultiKernelBench, the first comprehensive, multi-platform benchmark for LLM-based DL kernel generation. MultiKernelBench spans 285 tasks across 14 well-defined kernel categories and supports three major hardware platforms: Nvidia GPUs, Huawei NPUs, and Google TPUs. To enable future extensibility, we design a modular backend abstraction layer that decouples platform-specific logic from the core benchmarking infrastructure, allowing easy integration of new hardware platforms. We further propose a simple yet effective category-aware one-shot prompting method that improves generation quality by providing in-category exemplars. Through systematic evaluations of seven state-of-the-art LLMs, we reveal significant variation in task difficulty, poor generalization to platforms with less training exposure, and the effectiveness of targeted prompting strategies. MultiKernelBench is publicly available at https://github.com/wzzll123/MultiKernelBench.
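The category-aware one-shot prompting idea described in the abstract can be illustrated with a minimal sketch. It assumes that each exemplar is tagged with a kernel category and a target platform, and that the prompt builder prefers an exemplar from the task's own category. Every name here (`Exemplar`, `select_exemplar`, `build_prompt`, the category and platform strings, the fallback rule) is hypothetical and chosen for illustration; none of it is taken from the MultiKernelBench code base.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Exemplar:
    category: str   # kernel category label (hypothetical values below)
    platform: str   # target platform label (hypothetical values below)
    reference: str  # framework-level reference implementation
    kernel: str     # hand-written kernel for that platform


# Placeholder exemplar pool; the strings stand in for real source code.
EXEMPLARS = [
    Exemplar("activation", "cuda", "relu reference", "relu exemplar kernel"),
    Exemplar("convolution", "cuda", "conv reference", "conv exemplar kernel"),
]


def select_exemplar(exemplars, category, platform, fallback_category="activation"):
    """Prefer an exemplar from the task's own category on the same platform.

    If none exists, fall back to a generic category (an assumption of this
    sketch, not a documented behaviour of the benchmark).
    """
    for wanted in (category, fallback_category):
        for ex in exemplars:
            if ex.category == wanted and ex.platform == platform:
                return ex
    raise LookupError(f"no exemplar available for platform {platform!r}")


def build_prompt(task_source, category, platform, exemplars):
    """Assemble a one-shot prompt: one worked example, then the new task."""
    ex = select_exemplar(exemplars, category, platform)
    return (
        f"Write an optimized {platform} kernel for the operator below.\n\n"
        f"### Example reference\n{ex.reference}\n\n"
        f"### Example kernel\n{ex.kernel}\n\n"
        f"### Task reference\n{task_source}\n\n"
        f"### Task kernel\n"
    )
```

Calling `build_prompt(source, "convolution", "cuda", EXEMPLARS)` would place the convolution exemplar ahead of the task, while a category with no exemplar of its own falls back to the generic one.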
Similar Papers
MultiKernelBench: A Multi-Platform Benchmark for Kernel Generation
Distributed, Parallel, and Cluster Computing
Helps AI write code for faster computer chips.
KernelBench: Can LLMs Write Efficient GPU Kernels?
Machine Learning (CS)
Helps computers write faster code for AI.
Towards Robust Agentic CUDA Kernel Benchmarking, Verification, and Optimization
Software Engineering
Makes computer programs run much faster.