Discerning Legitimate Failures From False Alerts: A Study of Chromium's Continuous Integration
Guillaume Haben, Sarra Habchi, Mike Papadakis, Maxime Cordy, Yves Le Traon

TL;DR
This paper introduces Fair, a machine learning-based approach to distinguish legitimate test failures from false alerts in continuous integration, reducing unnecessary reruns and saving computational resources.
Contribution
The paper presents a novel lightweight classifier, Fair, that accurately differentiates false alerts from legitimate failures using features from test artifacts in Chromium's CI system.
Findings
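Fair distinguishes legitimate failures from false alerts with a Matthews correlation coefficient (MCC) of up to 95%.
Compared with reruns, Fair could save up to 20 minutes of computation time per build.
Fair remains effective across different test categories, even when failure data is limited.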
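For readers unfamiliar with the metric, MCC summarises a binary confusion matrix in a single value between -1 and 1 (reported above as a percentage). The following is a minimal sketch of the standard formula; the function name `mcc`, the choice of which class counts as positive, and the example counts are illustrative assumptions, not values or code from the paper.

```python
import math


def mcc(tp: int, tn: int, fp: int, fn: int) -> float:
    """Matthews correlation coefficient from confusion-matrix counts.

    Here tp/tn/fp/fn are counts for a binary task, e.g. treating
    "false alert" as the positive class (an assumption for illustration).
    """
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        # Common convention: MCC is taken as 0 when a row or column is empty.
        return 0.0
    return (tp * tn - fp * fn) / denom


# Hypothetical counts, NOT taken from the paper.
score = mcc(tp=90, tn=95, fp=5, fn=10)
print(f"MCC = {score:.3f}")
```

Unlike plain accuracy, MCC stays informative when one class is much rarer than the other, which is why it is a common choice when the two failure classes are imbalanced.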
Abstract
Flakiness is a major concern in software testing. Flaky tests pass and fail for the same version of a program and mislead developers, who spend time and resources investigating test failures only to discover that they are false alerts. In practice, the de facto approach to address this concern is to rerun failing tests, hoping that they will pass and thereby reveal themselves as false alerts. Nonetheless, completely filtering out false alerts may require a disproportionate number of reruns, and thus incurs significant costs in both computation and time. As an alternative to reruns, we propose Fair, a novel, lightweight approach that classifies test failures into false alerts and legitimate failures. Fair relies on a classifier and a set of features from the failures and test artefacts. To build and evaluate our machine learning classifier, we use the continuous integration of the Chromium project. In…
Taxonomy
Topics: Software Testing and Debugging Techniques · Software Engineering Research · Advanced Malware Detection Techniques
