Risk-Aware Batch Testing for Performance Regression Detection
Ali Sayedsalehi, Peter C. Rigby, Gregory Mierzwinski

TL;DR
This paper introduces a risk-aware batch testing framework for performance regression detection in CI systems, combining machine-learned commit risk with adaptive batching to reduce test executions and feedback time.
Contribution
It unifies regression risk prediction with adaptive batching, demonstrating significant resource savings and improved diagnostic speed in large-scale CI environments.
Findings
Risk-aware batching reduces total test executions by 32.4%.
The approach decreases mean feedback time by 3.8%.
It yields an estimated annual cost saving of $491K.
Abstract
Performance regression testing is essential in large-scale continuous-integration (CI) systems, yet executing full performance suites for every commit is prohibitively expensive. Prior work on performance regression prediction and batch testing has shown independent benefits, but each faces practical limitations: predictive models are rarely integrated into CI decision-making, and conventional batching strategies ignore commit-level heterogeneity. We unify these strands by introducing a risk-aware framework that integrates machine-learned commit risk with adaptive batching. Using Mozilla Firefox as a case study, we construct a production-derived dataset of human-confirmed regressions aligned chronologically with Autoland, and fine-tune ModernBERT, CodeBERT, and LLaMA-3.1 variants to estimate commit-level performance regression risk, achieving up to 0.694 ROC-AUC with CodeBERT. The…
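The abstract excerpt does not spell out the batching policy, so the following is only a minimal sketch of what combining commit risk with adaptive batching could look like, not the authors' algorithm. It assumes each commit carries a model-estimated regression probability, closes a batch once the chance that it contains at least one regression would exceed a budget, and bisects a failing batch to locate a culprit. The names `Commit`, `risk_aware_batches`, `bisect_culprit`, `risk_budget`, and `max_size` are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence


@dataclass
class Commit:
    sha: str
    risk: float  # hypothetical model-estimated probability of a regression


def risk_aware_batches(
    commits: Sequence[Commit],
    risk_budget: float = 0.5,  # assumed threshold, not taken from the paper
    max_size: int = 8,         # assumed cap on batch length
) -> List[List[Commit]]:
    """Group commits in chronological order into batches.

    A batch is closed when adding the next commit would push the
    probability that the batch contains at least one regression above
    risk_budget (treating commit risks as independent), or when the
    batch already holds max_size commits.
    """
    batches: List[List[Commit]] = []
    current: List[Commit] = []
    p_clean = 1.0  # probability that no commit in the current batch regresses
    for commit in commits:
        new_p_clean = p_clean * (1.0 - commit.risk)
        too_risky = 1.0 - new_p_clean > risk_budget
        if current and (too_risky or len(current) >= max_size):
            batches.append(current)
            current = []
            new_p_clean = 1.0 - commit.risk
        current.append(commit)
        p_clean = new_p_clean
    if current:
        batches.append(current)
    return batches


def bisect_culprit(
    batch: Sequence[Commit],
    batch_fails: Callable[[Sequence[Commit]], bool],
) -> Commit:
    """Binary-search a failing batch for its earliest regressing commit.

    Assumes the batch contains at least one culprit and that
    batch_fails(sub) is True exactly when sub contains a culprit.
    """
    lo, hi = 0, len(batch)
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if batch_fails(batch[lo:mid]):
            hi = mid  # a culprit lies in the first half
        else:
            lo = mid  # otherwise it must be in the second half
    return batch[lo]
```

Under this policy, low-risk commits share one test run while a high-risk commit tends to be tested alone, so a failure on it needs no bisection. The independence assumption and the fixed budget are simplifications chosen for illustration.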
