Evaluating SZZ Implementations Through a Developer-informed Oracle
Giovanni Rosa, Luca Pascarella, Simone Scalabrino, Rosalia Tufano, Gabriele Bavota, Michele Lanza, Rocco Oliveto

TL;DR
This paper introduces a developer-informed oracle using NLP to accurately evaluate SZZ bug-inducing change identification methods, addressing limitations of manual and existing evaluation approaches.
Contribution
It proposes a novel methodology combining NLP and manual filtering to create a reliable oracle for assessing SZZ implementations.
Findings
The oracle improves SZZ evaluation accuracy.
Evaluation revealed strengths and weaknesses of different SZZ variants.
Lessons learned guide future SZZ improvements.
Abstract
The SZZ algorithm for identifying bug-inducing changes has been widely used to evaluate defect prediction techniques and to empirically investigate when, how, and by whom bugs are introduced. Over the years, researchers have proposed several heuristics to improve the SZZ accuracy, providing various implementations of SZZ. However, fairly evaluating those implementations on a reliable oracle is an open problem: SZZ evaluations usually rely on (i) the manual analysis of the SZZ output to classify the identified bug-inducing commits as true or false positives; or (ii) a golden set linking bug-fixing and bug-inducing commits. In both cases, these manual evaluations are performed by researchers with limited knowledge of the studied subject systems. Ideally, there should be a golden set created by the original developers of the studied systems. We propose a methodology to build a…
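The abstract describes evaluating SZZ output against a golden set that links bug-fixing commits to bug-inducing commits, with each identified commit counted as a true or false positive. A minimal sketch of that comparison follows. It is not the authors' tooling: the function name `evaluate_szz`, the dictionary layout, the micro-averaging choice, and the commit identifiers are all assumptions made for illustration.

```python
from typing import Dict, Set, Tuple

# Hypothetical layout: bug-fixing commit id -> set of bug-inducing commit ids.
Links = Dict[str, Set[str]]


def evaluate_szz(predicted: Links, oracle: Links) -> Tuple[float, float, float]:
    """Return micro-averaged (precision, recall, F1) over the fixes in the oracle."""
    tp = fp = fn = 0
    for fix, expected in oracle.items():
        found = predicted.get(fix, set())
        tp += len(found & expected)   # inducing commits correctly identified
        fp += len(found - expected)   # reported commits not in the oracle
        fn += len(expected - found)   # oracle commits the implementation missed
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Made-up commit ids, used only to exercise the function.
oracle = {"fix1": {"a"}, "fix2": {"b", "c"}}
predicted = {"fix1": {"a", "x"}, "fix2": {"b"}}
precision, recall, f1 = evaluate_szz(predicted, oracle)
print(f"precision={precision:.2f} recall={recall:.2f} f1={f1:.2f}")
```

Only fixes present in the oracle are scored here, so links reported for commits outside the golden set are ignored rather than counted as false positives. That is one possible design choice, not necessarily the one used in the paper.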
Taxonomy
Topics: Software Engineering Research · Software Reliability and Analysis Research · Advanced Malware Detection Techniques
