Predictive Test Selection
Mateusz Machalica, Alex Samylkin, Meredith Porth, Satish Chandra

TL;DR
This paper introduces a machine-learning-based predictive test selection method for continuous integration at Facebook. It halves the infrastructure cost of testing code changes while still reporting over 95% of individual test failures and over 99.9% of faulty changes, and it accounts for flaky tests.
Contribution
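It presents a predictive test selection strategy learned from historical test outcomes. In production, the strategy cuts testing infrastructure cost in half at the price of a small, bounded loss in fault detection: under 5% of individual test failures and under 0.1% of faulty changes go unreported.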
Findings
Reduces total testing infrastructure cost by 50%.
Detects over 95% of individual test failures.
Identifies over 99.9% of faulty changes.
Abstract
Change-based testing is a key component of continuous integration at Facebook. However, a large number of tests coupled with a high rate of changes committed to our monolithic repository make it infeasible to run all potentially-impacted tests on each change. We propose a new predictive test selection strategy which selects a subset of tests to exercise for each change submitted to the continuous integration system. The strategy is learned from a large dataset of historical test outcomes using basic machine learning techniques. Deployed in production, the strategy reduces the total infrastructure cost of testing code changes by a factor of two, while guaranteeing that over 95% of individual test failures and over 99.9% of faulty changes are still reported back to developers. The method we present here also accounts for the non-determinism of test outcomes, also known as test flakiness.
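The selection idea and the reported metrics can be illustrated with a small sketch. It is not the authors' implementation: the abstract only says the strategy is learned from historical test outcomes, so the sketch substitutes a plain historical failure-rate estimate per (changed file, test) pair for the learned model. All function names, the record layout, and the threshold are hypothetical, and flakiness is ignored entirely.

```python
from collections import defaultdict


def train_failure_rates(history):
    """Estimate how often each test failed when a given file was changed.

    history: iterable of (changed_files, test_name, failed) records.
    This frequency table is a stand-in (assumption) for a learned model.
    """
    runs = defaultdict(int)
    fails = defaultdict(int)
    for changed_files, test, failed in history:
        for path in changed_files:
            runs[(path, test)] += 1
            fails[(path, test)] += int(failed)
    return {key: fails[key] / runs[key] for key in runs}


def score(rates, changed_files, test):
    """Highest historical failure rate over the changed files (0.0 if unseen)."""
    return max((rates.get((p, test), 0.0) for p in changed_files), default=0.0)


def select_tests(rates, changed_files, candidate_tests, threshold):
    """Keep only tests whose score reaches the (hypothetical) threshold."""
    return [t for t in candidate_tests
            if score(rates, changed_files, t) >= threshold]


def _ratio(numerator, denominator):
    return numerator / denominator if denominator else float("nan")


def evaluate(rates, changes, threshold):
    """Compute cost and recall metrics analogous to those in the abstract.

    changes: list of (changed_files, {test_name: failed}) with full outcomes.
    """
    total_runs = selected_runs = 0
    total_failures = caught_failures = 0
    faulty_changes = caught_changes = 0
    for changed_files, outcomes in changes:
        selected = set(select_tests(rates, changed_files, outcomes, threshold))
        failing = {t for t, failed in outcomes.items() if failed}
        total_runs += len(outcomes)
        selected_runs += len(selected)
        total_failures += len(failing)
        caught_failures += len(failing & selected)
        if failing:
            faulty_changes += 1
            # A faulty change counts as caught if any failing test was selected.
            caught_changes += bool(failing & selected)
    return {
        "fraction_of_tests_run": _ratio(selected_runs, total_runs),
        "test_failure_recall": _ratio(caught_failures, total_failures),
        "faulty_change_recall": _ratio(caught_changes, faulty_changes),
    }


# Toy data, invented purely for illustration.
history = [
    (["a.py"], "test_a", True),
    (["a.py"], "test_a", False),
    (["a.py"], "test_b", False),
    (["b.py"], "test_b", True),
    (["b.py"], "test_a", False),
]
changes = [
    (["a.py"], {"test_a": True, "test_b": False}),
    (["b.py"], {"test_a": False, "test_b": True}),
    (["a.py", "b.py"], {"test_a": True, "test_b": True}),
]
rates = train_failure_rates(history)
metrics = evaluate(rates, changes, threshold=0.25)
```

Raising the threshold runs fewer tests but lowers both recall figures. Faulty-change recall is never lower than test-failure recall in spirit, because a change is reported as soon as any one of its failing tests is selected, which is consistent with the abstract quoting a higher figure for faulty changes than for individual test failures.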
