Leveraging Commit Size Context and Hyper Co-Change Graph Centralities for Defect Prediction
Amit Kumar, Ethari Hrishikesh, Sonali Agarwal

TL;DR
This paper introduces commit-size-aware process metrics and hypergraph centralities to improve file-level defect prediction, demonstrating enhanced performance across multiple projects and classifiers.
Contribution
It proposes a novel approach combining commit size profiles and hyper co-change graph centralities, capturing higher-order change semantics for defect prediction.
Findings
Replacing scalar process metrics with commit-size-aware vectors improves prediction accuracy.
Hypergraph centralities effectively quantify size-aware node importance.
The approach outperforms traditional models across nine Apache projects.
Abstract
File-level defect prediction models traditionally rely on product and process metrics. While process metrics effectively complement product metrics, they often overlook commit size (the number of files changed per commit) despite its strong association with software quality. Network centrality measures on dependency graphs have also proven to be valuable product-level indicators. Motivated by this, we first redefine process metrics as commit-size-aware process metric vectors, transforming conventional scalar measures into 100-dimensional profiles that capture the distribution of changes across commit-size strata. We then model change history as a hyper co-change graph, where hyperedges naturally encode commit-size semantics. Vector centralities computed on these hypergraphs quantify size-aware node importance for source files. Experiments on nine long-lived Apache projects using five…
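To make the idea of a commit-size-aware profile concrete, the following is a minimal Python sketch, not the authors' implementation. It assumes each commit is given as a list of changed file paths and treats each commit as one hyperedge. It assumes one stratum per commit size from 1 to 100, with larger commits folded into the last stratum; the abstract does not specify the binning scheme. The function name `commit_size_profiles` and the sample file names are hypothetical.

```python
from collections import defaultdict


def commit_size_profiles(commits, max_size=100):
    """Return {file: vector}, where vector[k-1] counts the commits of size k
    that touched the file. Commits larger than max_size are counted in the
    last entry (an assumed binning, not taken from the paper)."""
    profiles = defaultdict(lambda: [0] * max_size)
    for files in commits:
        hyperedge = set(files)  # one commit = one hyperedge over its files
        if not hyperedge:
            continue  # skip empty commits
        bucket = min(len(hyperedge), max_size) - 1
        for path in hyperedge:
            profiles[path][bucket] += 1
    return dict(profiles)


# Hypothetical change history: four commits over three files.
commits = [
    ["a.java"],
    ["a.java", "b.java"],
    ["a.java", "b.java", "c.java"],
    ["b.java", "c.java"],
]
profiles = commit_size_profiles(commits)
```

Summing a file's vector recovers the conventional scalar commit count, so the profile refines the scalar metric rather than replacing its information. Read as a graph quantity, the same vector is the file's hyperdegree split by hyperedge size, which is only the simplest size-aware measure and not necessarily one of the vector centralities the paper evaluates.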
