Discerning Legitimate Failures From False Alerts: A Study of Chromium's Continuous Integration

Guillaume Haben; Sarra Habchi; Mike Papadakis; Maxime Cordy; Yves Le Traon

arXiv:2111.03382 · cs.SE · November 8, 2021

TL;DR

This paper introduces Fair, a machine learning-based approach to distinguish legitimate test failures from false alerts in continuous integration, reducing unnecessary reruns and saving computational resources.

Contribution

The paper presents a novel lightweight classifier, Fair, that accurately differentiates false alerts from legitimate failures using features from test artifacts in Chromium's CI system.

Findings

01

Fair achieves a Matthews correlation coefficient (MCC) of up to 95% when classifying failures.

02

Fair reduces rerun costs, saving up to 20 minutes per build.

03

Fair remains effective across different test categories, even with limited failure data.
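
For readers unfamiliar with the metric in the first finding, here is a minimal sketch of how MCC is computed from a binary confusion matrix. The function name `mcc` and the counts in the usage note are illustrative assumptions and are not taken from the paper or its replication package. MCC ranges from -1 to 1, so a reported "95%" corresponds to a value of 0.95.

```python
import math


def mcc(tp: int, tn: int, fp: int, fn: int) -> float:
    """Matthews correlation coefficient from confusion-matrix counts.

    tp/tn/fp/fn are true positives, true negatives, false positives,
    and false negatives. Which class counts as "positive" (for example,
    false alerts) is an assumption left to the caller.
    """
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        # Common convention: report 0 when any row or column sum is zero.
        return 0.0
    return (tp * tn - fp * fn) / denom
```

As a usage example with made-up counts, `mcc(90, 95, 5, 10)` scores a classifier that labels 90 of 100 positives and 95 of 100 negatives correctly. Unlike plain accuracy, MCC uses all four cells of the confusion matrix, which makes it informative when one class is much rarer than the other.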

Abstract

Flakiness is a major concern in software testing. Flaky tests pass and fail for the same version of a program and mislead developers, who spend time and resources investigating test failures only to discover that they are false alerts. In practice, the de facto approach to address this concern is to rerun failing tests, hoping that they will pass and thus reveal themselves as false alerts. Nonetheless, completely filtering out false alerts may require a disproportionate number of reruns, and thus incurs significant costs in both computation and time. As an alternative to reruns, we propose Fair, a novel, lightweight approach that classifies test failures into false alerts and legitimate failures. Fair relies on a classifier and a set of features from the failures and test artefacts. To build and evaluate our machine learning classifier, we use the continuous integration of the Chromium project. In…

Code & Models

Repositories

guillaumehaben/fair-replicationpackage (Official)

Taxonomy

Topics: Software Testing and Debugging Techniques · Software Engineering Research · Advanced Malware Detection Techniques