SZZ in the time of Pull Requests
Fernando Petrulio, David Ackermann, Enrico Fregnan, Gül Calikli, Marco Castelluccio, Sylvestre Ledru, Calixte Denizet, Emma Humphries, Alberto Bacchelli

TL;DR
This paper evaluates how reliably the SZZ algorithm identifies fix-inducing changes in the multi-commit development model, using an adapted existing SZZ dataset and data from Mozilla, and examines how detecting non-relevant commits before running SZZ affects its results.
Contribution
It provides an in-depth analysis of SZZ's reliability in multi-commit contexts and proposes methods to detect non-relevant commits before applying SZZ.
Findings
SZZ performance varies in multi-commit models
Non-relevant commits impact SZZ accuracy
Automatic detection of irrelevant commits improves results
Abstract
In the multi-commit development model, programmers complete tasks (e.g., implementing a feature) by organizing their work in several commits and packaging them into a commit-set. Analyzing data from developers using this model can be useful to tackle challenging developers' needs, such as knowing which features introduce a bug as well as assessing the risk of integrating certain features in a release. However, to do so one first needs to identify fix-inducing commit-sets. For such an identification, the SZZ algorithm is the most natural candidate, but its performance has not been evaluated in the multi-commit context yet. In this study, we conduct an in-depth investigation on the reliability and performance of SZZ in the multi-commit model. To obtain a reliable ground truth, we consider an already existing SZZ dataset and adapt it to the multi-commit context. Moreover, we devise a…
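To make the notion of a fix-inducing commit-set concrete, here is a minimal sketch of one way commit-level SZZ output could be lifted to the commit-set level. It is an illustration under stated assumptions, not the authors' method: the function name `lift_to_commit_sets`, the `(fix_commit, inducing_commit)` pair format, and the `commit_to_set` mapping are all hypothetical, and dropping links that stay inside a single commit-set is a simplifying choice made here.

```python
from collections import defaultdict


def lift_to_commit_sets(szz_pairs, commit_to_set):
    """Aggregate commit-level SZZ links into commit-set-level links.

    szz_pairs:      iterable of (fix_commit, inducing_commit) ids, assumed
                    to come from some SZZ implementation (hypothetical format).
    commit_to_set:  dict mapping a commit id to the id of the commit-set
                    (e.g. a pull request) that contains it (hypothetical).
    Returns a dict: fixing commit-set id -> set of inducing commit-set ids.
    """
    induced_by = defaultdict(set)
    for fix_commit, inducing_commit in szz_pairs:
        fix_set = commit_to_set.get(fix_commit)
        inducing_set = commit_to_set.get(inducing_commit)
        # Assumption: skip commits that belong to no known commit-set.
        if fix_set is None or inducing_set is None:
            continue
        # Assumption: a link inside one commit-set is not reported.
        if fix_set == inducing_set:
            continue
        induced_by[fix_set].add(inducing_set)
    return dict(induced_by)


if __name__ == "__main__":
    pairs = [("c3", "c1"), ("c3", "c2"), ("c4", "c3"), ("c5", "zz")]
    mapping = {"c1": "PR-1", "c2": "PR-1", "c3": "PR-2", "c4": "PR-2", "c5": "PR-3"}
    print(lift_to_commit_sets(pairs, mapping))
```

In this toy input, two commit-level links from `c3` collapse into a single link between commit-sets, which is why results measured per commit and per commit-set can differ.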
Taxonomy
Topics: Software Engineering Research · Software Engineering Techniques and Practices · Web Application Security Vulnerabilities
