CI at Scale: Lean, Green, and Fast
Dhruva Juloori, Zhongpeng Lin, Matthew Williams, Eddy Shin, Sonal Mahajan

TL;DR
This paper presents enhancements to Uber's SubmitQueue system, using machine learning and probabilistic modeling to optimize resource usage and build prioritization, significantly improving CI efficiency in large-scale monorepos.
Contribution
The paper introduces a novel probabilistic and machine learning-based approach to optimize build scheduling and resource utilization in large-scale CI systems.
Findings
53% reduction in CI resource usage
44% decrease in CPU consumption
37% improvement in waiting times
Abstract
Maintaining a "green" mainline branch, where all builds pass successfully, is crucial but challenging in fast-paced, large-scale software development environments, particularly with concurrent code changes in large monorepos. SubmitQueue, a system designed to address these challenges, speculatively executes builds and only lands changes with successful outcomes. However, despite its effectiveness, the system faces inefficiencies in resource utilization, leading to a high rate of premature build aborts and delays in landing smaller changes blocked by larger conflicting ones. This paper introduces enhancements to SubmitQueue, focusing on optimizing resource usage and improving build prioritization. Central to this is our innovative probabilistic model, which distinguishes between changes with shorter and longer build times to prioritize builds for more efficient scheduling. By leveraging…
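To make the prioritization idea concrete, here is a minimal sketch of ranking pending changes so that short builds likely to pass are not stuck behind long ones. It is an illustrative heuristic, not the paper's model: the `PendingChange` fields, the `priority` score (estimated success probability divided by estimated build time), and the sample values are all assumptions introduced for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PendingChange:
    change_id: str
    p_success: float          # assumed estimate that the build passes (0..1)
    est_build_minutes: float  # assumed estimate of the build duration


def priority(change: PendingChange) -> float:
    """Hypothetical score: expected successful landings per build-minute."""
    if change.est_build_minutes <= 0:
        raise ValueError("est_build_minutes must be positive")
    return change.p_success / change.est_build_minutes


def schedule(changes, worker_slots):
    """Pick the highest-scoring changes for the available build workers."""
    ranked = sorted(changes, key=priority, reverse=True)
    return ranked[:worker_slots]


# Made-up queue: one long build, one short build, one unlikely to pass.
queue = [
    PendingChange("large-refactor", p_success=0.90, est_build_minutes=60.0),
    PendingChange("small-fix", p_success=0.80, est_build_minutes=5.0),
    PendingChange("risky-change", p_success=0.10, est_build_minutes=10.0),
]

selected = [c.change_id for c in schedule(queue, worker_slots=2)]
print(selected)
```

With two worker slots, the short, likely-to-pass change is scheduled first, and the long build still runs ahead of the change that is unlikely to pass. A real system would also need to account for conflicts between changes, which this sketch ignores.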
Taxonomy
Topics: Big Data and Business Intelligence
