Predictive Test Selection

Mateusz Machalica; Alex Samylkin; Meredith Porth; Satish Chandra

arXiv:1810.05286·cs.SE·May 31, 2019

TL;DR

This paper introduces a machine-learning-based predictive test selection method for continuous integration at Facebook. It roughly halves the infrastructure cost of testing code changes while still reporting over 95% of individual test failures and over 99.9% of faulty changes, and it accounts for flaky tests.

Contribution

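The paper presents a predictive test selection strategy learned from a large dataset of historical test outcomes. Deployed in production, it halves testing infrastructure cost while still reporting over 95% of individual test failures and over 99.9% of faulty changes to developers.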

Findings

01 Reduces total testing infrastructure cost by 50%.

02 Detects over 95% of individual test failures.

03 Identifies over 99.9% of faulty changes.

Abstract

Change-based testing is a key component of continuous integration at Facebook. However, a large number of tests coupled with a high rate of changes committed to our monolithic repository make it infeasible to run all potentially-impacted tests on each change. We propose a new predictive test selection strategy which selects a subset of tests to exercise for each change submitted to the continuous integration system. The strategy is learned from a large dataset of historical test outcomes using basic machine learning techniques. Deployed in production, the strategy reduces the total infrastructure cost of testing code changes by a factor of two, while guaranteeing that over 95% of individual test failures and over 99.9% of faulty changes are still reported back to developers. The method we present here also accounts for the non-determinism of test outcomes, also known as test flakiness.
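
The idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract only says the strategy is learned with "basic machine learning techniques", so the sketch substitutes a plain historical failure rate per (changed file, test) pair for the learned model. The data, the function names (`fit_failure_rates`, `select_tests`, `recall_metrics`), and the threshold value are all hypothetical. The second half shows how the two reported quantities differ: the share of individual test failures caught versus the share of faulty changes caught.

```python
from collections import defaultdict

# Hypothetical history of (changed_file, test_name, failed) observations.
HISTORY = [
    ("a.py", "test_a", True),
    ("a.py", "test_a", False),
    ("a.py", "test_b", False),
    ("a.py", "test_b", False),
    ("b.py", "test_b", True),
    ("b.py", "test_b", True),
    ("b.py", "test_a", False),
]

def fit_failure_rates(history):
    """Estimate P(test fails | file changed) as a raw historical frequency.

    Stand-in for a learned model; this is an assumption of the sketch.
    """
    counts = defaultdict(lambda: [0, 0])  # (file, test) -> [failures, runs]
    for changed_file, test, failed in history:
        entry = counts[(changed_file, test)]
        entry[0] += int(failed)
        entry[1] += 1
    return {key: fails / runs for key, (fails, runs) in counts.items()}

def select_tests(rates, changed_files, candidate_tests, threshold=0.1):
    """Keep each candidate test whose best score over the changed files
    reaches the (hypothetical) threshold."""
    selected = []
    for test in candidate_tests:
        score = max((rates.get((f, test), 0.0) for f in changed_files), default=0.0)
        if score >= threshold:
            selected.append(test)
    return selected

def recall_metrics(selections, failures):
    """Return (test-failure recall, faulty-change recall).

    selections: change id -> set of tests that were selected and run
    failures:   change id -> set of tests that would truly fail
    """
    total_failures = sum(len(f) for f in failures.values())
    caught_failures = sum(len(f & selections.get(c, set())) for c, f in failures.items())
    faulty = [c for c, f in failures.items() if f]
    caught_changes = sum(1 for c in faulty if failures[c] & selections.get(c, set()))
    test_recall = caught_failures / total_failures if total_failures else 1.0
    change_recall = caught_changes / len(faulty) if faulty else 1.0
    return test_recall, change_recall

rates = fit_failure_rates(HISTORY)
picked = select_tests(rates, ["a.py"], ["test_a", "test_b"])

# Change c1 has two failing tests but only one was selected; it is still
# flagged as faulty because at least one failing test ran.
test_recall, change_recall = recall_metrics(
    selections={"c1": {"test_a"}, "c2": {"test_b"}},
    failures={"c1": {"test_a", "test_b"}, "c2": {"test_b"}},
)
```

In this toy run, one of three individual failures is missed, yet both faulty changes are still caught, because a change counts as reported as soon as any one of its failing tests is selected. That is why the change-level figure in the abstract (over 99.9%) can exceed the test-level figure (over 95%).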
