Explaining Code Risk in OSS: Towards LLM-Generated Fault Prediction Interpretations
By: Elijah Kayode Adejumo, Brittany Johnson
Potential Business Impact:
Helps coders fix bugs by explaining code risks.
Open Source Software (OSS) has become critical infrastructure worldwide because of the value it provides. OSS typically depends on contributions from developers across diverse backgrounds and levels of experience. Making safe changes, such as fixing a bug or implementing a new feature, can be challenging, especially in object-oriented systems where components are interdependent. Static analysis and defect-prediction tools produce metrics (e.g., complexity, coupling) that flag potentially fault-prone components, but these signals are often hard to interpret for contributors who are new to or unfamiliar with the codebase. Large Language Models (LLMs) have shown strong performance on software engineering tasks such as code summarization and documentation generation. Building on this progress, we investigate whether LLMs can translate fault-prediction metrics into clear, human-readable risk explanations and actionable guidance that help OSS contributors plan and review code modifications. We describe the explanation types that an LLM-based assistant could provide (descriptive, contextual, and actionable explanations). We also outline our next steps to assess usefulness through a task-based study with OSS contributors, comparing metric-only baselines against LLM-generated explanations on decision quality, time-to-completion, and error rates.
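
To make the proposed pipeline concrete, here is a minimal Python sketch (an illustration of the general idea, not the authors' implementation): it packages hypothetical metric values for a flagged component into a prompt requesting the three explanation types named in the abstract. All names and values (ComponentMetrics, build_explanation_prompt, the PaymentProcessor example) are illustrative assumptions; the resulting prompt could be sent to any chat-completion LLM API.

    from dataclasses import dataclass

    @dataclass
    class ComponentMetrics:
        name: str
        cyclomatic_complexity: int   # number of independent paths through the code
        coupling: int                # e.g., coupling between objects (CBO)
        fault_proneness: float       # score in [0, 1] from a defect-prediction model

    def build_explanation_prompt(m: ComponentMetrics) -> str:
        """Ask an LLM for the three explanation types outlined in the
        abstract: descriptive, contextual, and actionable."""
        return (
            f"A defect-prediction model flagged the class `{m.name}` as "
            f"{m.fault_proneness:.0%} likely to be fault-prone.\n"
            f"Metrics: cyclomatic complexity = {m.cyclomatic_complexity}, "
            f"coupling (CBO) = {m.coupling}.\n\n"
            "For a contributor unfamiliar with this codebase, provide:\n"
            "1. Descriptive: what these metric values mean in plain language.\n"
            "2. Contextual: why values like these tend to make changes risky.\n"
            "3. Actionable: concrete steps to reduce risk before modifying it."
        )

    if __name__ == "__main__":
        metrics = ComponentMetrics("PaymentProcessor", cyclomatic_complexity=27,
                                   coupling=14, fault_proneness=0.82)
        print(build_explanation_prompt(metrics))  # hand this string to an LLM

A metric-only baseline, by contrast, would show the contributor just the raw numbers (complexity 27, CBO 14, score 0.82), which is exactly the interpretability gap the study is designed to measure.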
Similar Papers
Exploring the Potential and Limitations of Large Language Models for Novice Program Fault Localization
Software Engineering
Helps new coders find mistakes in their programs.
LLM-based Vulnerability Discovery through the Lens of Code Metrics
Cryptography and Security
Finds computer bugs by looking at code patterns.
AutoEmpirical: LLM-Based Automated Research for Empirical Software Fault Analysis
Software Engineering
Finds software problems much faster than people.