Co-Change Graph Entropy: A New Process Metric for Defect Prediction
By: Ethari Hrishikesh, Amit Kumar, Meher Bhardwaj and more
Potential Business Impact:
Finds bugs in computer code better.
Process metrics, valued for their language independence and ease of collection, have been shown to outperform product metrics in defect prediction. Among these, change entropy (Hassan, 2009) is widely used at the file level and has proven highly effective. Additionally, past research suggests that co-change patterns provide valuable insights into software quality. Building on these findings, we introduce Co-Change Graph Entropy, a novel metric that models co-changes as a graph to quantify co-change scattering. Experiments on eight Apache projects reveal a significant correlation between co-change entropy and defect counts at the file level, with a Pearson correlation coefficient of up to 0.54. In file-level defect classification, replacing change entropy with co-change entropy improves AUROC in 72.5% of cases and MCC in 62.5% of cases across 40 experimental settings (five machine learning classifiers on each of eight projects), though these improvements are not statistically significant. However, when co-change entropy is combined with change entropy, AUROC improves in 82.5% of cases and MCC in 65% of cases, with statistically significant gains confirmed via the Friedman test followed by the post-hoc Nemenyi test. These results indicate that co-change entropy complements change entropy, significantly enhancing defect classification performance and underscoring its practical importance in defect prediction.
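To make the two entropy notions concrete, the following is a minimal Python sketch. The change-entropy function applies Shannon entropy to per-file modification counts within one period, which is the general idea behind Hassan's metric. The co-change variant is only an illustration: the abstract does not give the exact formula, so the sketch assumes a graph whose nodes are files, whose edge weights count how often two files appear in the same commit, and whose entropy is taken over each file's share of the total weighted degree. All function names and the commit representation (a list of file lists) are hypothetical.

```python
from collections import Counter
from itertools import combinations
from math import log2


def shannon_entropy(counts):
    """Shannon entropy (in bits) of the distribution implied by raw counts."""
    counts = list(counts)
    total = sum(counts)
    if total == 0:
        return 0.0
    return -sum((c / total) * log2(c / total) for c in counts if c > 0)


def change_entropy(commits):
    """Entropy of per-file modification counts within one period."""
    changes = Counter(f for commit in commits for f in set(commit))
    return shannon_entropy(changes.values())


def co_change_edges(commits):
    """Weighted edges: how many commits touched both files of each pair."""
    edges = Counter()
    for commit in commits:
        for a, b in combinations(sorted(set(commit)), 2):
            edges[(a, b)] += 1
    return edges


def co_change_graph_entropy(commits):
    """ASSUMPTION (not the paper's stated formula): entropy of each
    file's share of the total weighted degree in the co-change graph."""
    degree = Counter()
    for (a, b), weight in co_change_edges(commits).items():
        degree[a] += weight
        degree[b] += weight
    return shannon_entropy(degree.values())


if __name__ == "__main__":
    # Hypothetical commit history: each commit is the list of files it touched.
    history = [["a.java", "b.java"], ["a.java", "b.java"], ["c.java"]]
    print(change_entropy(history))
    print(co_change_graph_entropy(history))
```

Under this assumed definition, a file that is only ever changed alone contributes to change entropy but not to the co-change graph, which is one way the two metrics could carry complementary information.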
Similar Papers
Change-Point Detection Utilizing Normalized Entropy as a Fundamental Metric
Applications
Finds sudden changes in data patterns.
Information-Theoretic Detection of Unusual Source Code Changes
Software Engineering
Finds weird code changes automatically.
Unveiling Hybrid Cyclomatic Complexity: A Comprehensive Analysis and Evaluation as an Integral Feature in Automatic Defect Prediction Models
Software Engineering
Finds bugs in computer programs faster.