Empirical Analysis of Temporal and Spatial Fault Characteristics in Multi-Fault Bug Repositories
By: Dylan Callaghan, Alexandra van der Spuy, Bernd Fischer
Potential Business Impact:
Finds hidden software bugs that last a long time.
Fixing software faults contributes significantly to the cost of software maintenance and evolution. Techniques for reducing these costs require datasets of software faults, as well as an understanding of those faults, for optimal testing and evaluation. In this paper, we present an empirical analysis of the temporal and spatial characteristics of faults in 16 open-source Java and Python projects, which form part of the Defects4J and BugsInPy datasets, respectively. Our findings show that many faults in these software systems are long-lived, so that the majority of software versions contain multiple coexisting faults. This contrasts with the assumptions of the original datasets, in which the majority of versions identify only a single fault. In addition, we show that although the faults are found in only a small subset of the systems, they are often evenly distributed within this subset, leading to relatively few bug hotspots.
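To make the notion of coexisting faults concrete, the following is a minimal sketch and not the authors' tooling. It assumes each fault has a known lifetime, given as the index of the version that introduced it and the index of the version that fixed it, and counts how many faults are present in each version. The function names, the half-open lifetime convention, and the sample data are all hypothetical and chosen only for illustration.

```python
def coexisting_fault_counts(lifetimes, num_versions):
    """Count the faults present in each version.

    lifetimes maps a fault id to (introduced, fixed) version indices.
    Assumption: a fault is present in version v when introduced <= v < fixed.
    """
    counts = [0] * num_versions
    for introduced, fixed in lifetimes.values():
        # Clamp the lifetime to the range of versions under study.
        for v in range(max(introduced, 0), min(fixed, num_versions)):
            counts[v] += 1
    return counts


def share_of_multi_fault_versions(counts):
    """Fraction of faulty versions that contain more than one fault."""
    faulty = [c for c in counts if c > 0]
    if not faulty:
        return 0.0
    return sum(1 for c in faulty if c > 1) / len(faulty)


# Hypothetical lifetimes for three faults over six versions.
example_lifetimes = {"F1": (0, 4), "F2": (1, 3), "F3": (2, 5)}
example_counts = coexisting_fault_counts(example_lifetimes, 6)
example_share = share_of_multi_fault_versions(example_counts)
```

In this made-up example, the three overlapping lifetimes leave three of the five faulty versions with more than one fault, even though a dataset that records one fault per version would treat each version as single-fault.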
Similar Papers
AutoEmpirical: LLM-Based Automated Research for Empirical Software Fault Analysis
Software Engineering
Finds software problems much faster than people.
The Repeat Offenders: Characterizing and Predicting Extremely Bug-Prone Source Methods
Software Engineering
Finds code most likely to have bugs.
How Far Are We? An Empirical Analysis of Current Vulnerability Localization Approaches
Software Engineering
Finds computer bugs faster and more accurately.