Crash Report Enhancement with Large Language Models: An Empirical Study
By: S M Farah Al Fahim, Md Nakhla Rafi, Zeyang Ma and more
Potential Business Impact:
Helps fix computer bugs faster by explaining problems.
Crash reports are central to software maintenance, yet many lack the diagnostic detail developers need to debug efficiently. We examine whether large language models can enhance crash reports by adding fault locations, root-cause explanations, and repair suggestions. We study two enhancement strategies: Direct-LLM, a single-shot approach that uses stack-trace context, and Agentic-LLM, an iterative approach that explores the repository for additional evidence. On a dataset of 492 real-world crash reports, LLM-enhanced reports improve Top-1 problem-localization accuracy from 10.6% (original reports) to 40.2-43.1%, and produce suggested fixes that closely resemble developer patches (CodeBLEU around 56-57%). Both our manual evaluations and LLM-as-a-judge assessment show that Agentic-LLM delivers stronger root-cause explanations and more actionable repair guidance. A user study with 16 participants further confirms that enhanced reports make crashes easier to understand and resolve, with the largest improvement in repair guidance. These results indicate that supplying LLMs with stack traces and repository code yields enhanced crash reports that are substantially more useful for debugging.
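The Top-1 localization figures above can be read as the share of crash reports whose highest-ranked predicted location matches a true fault location. The abstract does not state the matching granularity or scoring rules, so the following is only a minimal sketch under assumptions: locations are compared as method-level strings, a report counts as a hit if any true location appears among its top k predictions, and all names and data here (`top_k_accuracy`, `ranked`, `truth`) are hypothetical illustrations, not the authors' evaluation code.

```python
from typing import Sequence, Set


def top_k_accuracy(
    ranked_predictions: Sequence[Sequence[str]],
    ground_truth: Sequence[Set[str]],
    k: int = 1,
) -> float:
    """Fraction of reports with at least one true fault location in the top k predictions."""
    if len(ranked_predictions) != len(ground_truth):
        raise ValueError("predictions and ground truth must have the same length")
    if k < 1:
        raise ValueError("k must be at least 1")
    if not ranked_predictions:
        return 0.0
    hits = sum(
        1
        for preds, true_locations in zip(ranked_predictions, ground_truth)
        if any(p in true_locations for p in preds[:k])
    )
    return hits / len(ranked_predictions)


# Toy data (invented for illustration): one ranked list per crash report.
ranked = [
    ["Parser.parse", "Lexer.next"],  # correct at rank 1
    ["Cache.get", "Cache.put"],      # correct at rank 2
    ["Http.send", "Http.retry"],     # never correct
    ["Config.load"],                 # correct at rank 1
]
truth = [
    {"Parser.parse"},
    {"Cache.put"},
    {"Socket.open"},
    {"Config.load", "Config.merge"},
]

print(top_k_accuracy(ranked, truth, k=1))  # 0.5
print(top_k_accuracy(ranked, truth, k=2))  # 0.75
```

On this toy data, two of the four reports are localized at rank 1 and a third at rank 2, which is why the score rises from 0.5 to 0.75 as k grows.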
Similar Papers
Large Language Models for Fault Localization: An Empirical Study
Software Engineering
Finds bugs in computer code faster.
Empirical Evaluation of Large Language Models in Automated Program Repair
Software Engineering
Fixes computer code errors faster and better.
Exploring the Potential and Limitations of Large Language Models for Novice Program Fault Localization
Software Engineering
Helps new coders find mistakes in their programs.