A First Look at Bugs in LLM Inference Engines
By: Mugeng Liu, Siqi Zhong, Weichen Bi, and more
Potential Business Impact:
Finds and fixes bugs in AI language tools.
Large language model-specific inference engines (hereafter, LLM inference engines) have become a fundamental component of modern AI infrastructure, enabling the deployment of LLM-powered applications (LLM apps) across cloud and local devices. Despite their critical role, LLM inference engines are prone to bugs due to the immense resource demands of LLMs and the complexities of cross-platform compatibility. However, a systematic understanding of these bugs remains lacking. To bridge this gap, we present the first empirical study of bugs in LLM inference engines. We mine the official repositories of 5 widely adopted LLM inference engines, constructing a comprehensive dataset of 929 real-world bugs. Through a rigorous open-coding process, we analyze these bugs to uncover their symptoms, root causes, and commonalities. Our findings reveal six major bug symptoms and a taxonomy of 28 root causes, shedding light on the key challenges in bug detection and localization within LLM inference engines. Based on these insights, we propose a series of actionable implications for researchers, inference engine vendors, and LLM app developers.
Similar Papers
A Survey on Inference Engines for Large Language Models: Perspectives on Optimization and Efficiency
Computation and Language
Makes smart computer programs run faster and cheaper.
Towards Understanding Bugs in Distributed Training and Inference Frameworks for Large Language Models
Software Engineering
Finds and fixes bugs in AI training tools.
The Foundation Cracks: A Comprehensive Study on Bugs and Testing Practices in LLM Libraries
Software Engineering
Fixes AI mistakes caused by wrong instructions.