Imitation Game: Reproducing Deep Learning Bugs Leveraging an Intelligent Agent
By: Mehil B Shah, Mohammad Masudur Rahman, Foutse Khomh
Despite their wide adoption in various domains (e.g., healthcare, finance, software engineering), Deep Learning (DL)-based applications suffer from many bugs, failures, and vulnerabilities. Reproducing these bugs is essential for their resolution, but it is extremely challenging due to the inherent nondeterminism of DL models and their tight coupling with hardware and software environments. According to recent studies, only about 3% of DL bugs can be reliably reproduced using manual approaches. To address these challenges, we present RepGen, a novel, automated, and intelligent approach for reproducing deep learning bugs. RepGen constructs a learning-enhanced context from a project, develops a comprehensive plan for bug reproduction, and employs an iterative generate-validate-refine mechanism in which an LLM generates code that reproduces the bug at hand. We evaluate RepGen on 106 real-world deep learning bugs and achieve a reproduction rate of 80.19%, a 19.81% improvement over the state-of-the-art measure. A developer study involving 27 participants shows that RepGen improves the success rate of DL bug reproduction by 23.35%, reduces the time to reproduce by 56.8%, and lowers participants' cognitive load.
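The iterative generate-validate-refine mechanism named in the abstract can be pictured with a minimal Python sketch. This is an illustration under stated assumptions, not RepGen's implementation: the names `generate_validate_refine`, `toy_generate`, and `toy_validate` are hypothetical, the generator is a hard-coded stand-in for an LLM call, and "reproduced" is simplified to "the candidate code raises the expected exception".

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple


@dataclass
class Attempt:
    """One round of the loop: the candidate code and what validation said."""
    code: str
    feedback: str
    reproduced: bool


def generate_validate_refine(
    generate: Callable[[str], str],
    validate: Callable[[str], Tuple[bool, str]],
    max_rounds: int = 3,
) -> Tuple[Optional[str], List[Attempt]]:
    """Ask for candidate code, check it, and feed the result back until
    the bug is reproduced or the round budget runs out."""
    feedback = ""
    history: List[Attempt] = []
    for _ in range(max_rounds):
        code = generate(feedback)        # hypothetical stand-in for an LLM call
        ok, feedback = validate(code)    # does the candidate trigger the bug?
        history.append(Attempt(code, feedback, ok))
        if ok:
            return code, history
    return None, history


def toy_generate(feedback: str) -> str:
    """Toy generator (assumption): the first guess is wrong, the refined one is right."""
    return "x = 1 / 0" if feedback else "x = 1 + 1"


def toy_validate(code: str) -> Tuple[bool, str]:
    """Toy validator (assumption): the 'bug' is a ZeroDivisionError."""
    try:
        exec(code, {})  # run the candidate in a fresh namespace
    except ZeroDivisionError:
        return True, "expected ZeroDivisionError raised"
    except Exception as exc:
        return False, f"wrong failure: {type(exc).__name__}"
    return False, "no error raised; the bug was not triggered"
```

Calling `generate_validate_refine(toy_generate, toy_validate)` fails on the first round, passes the validator's feedback to the generator, and succeeds on the second. A real validator would need to account for the nondeterminism and environment coupling the abstract describes, which this toy check ignores.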
Similar Papers
Improving the Reproducibility of Deep Learning Software: An Initial Investigation through a Case Study Analysis
Machine Learning (CS)
Makes computer learning results work again.
BugGen: A Self-Correcting Multi-Agent LLM Pipeline for Realistic RTL Bug Synthesis
Software Engineering
Finds computer chip mistakes much faster.
Reflective Paper-to-Code Reproduction Enabled by Fine-Grained Verification
Software Engineering
Helps computers copy science papers into working code.