GitFarm: Git as a Service for Large-Scale Monorepos
Preetam Dwivedi, Akshay Hacholli, Adam Bettigole

TL;DR
GitFarm is a platform that offers Git as a service for large monorepos, significantly reducing clone times, client overhead, and server load by executing Git operations remotely within secure sandboxes.
Contribution
We introduce GitFarm, a novel system that decouples repository management from clients, enabling fast, scalable, and secure Git operations for large-scale monorepos.
Findings
Provides checkout times of under 1 second
Reduces client compute and I/O overhead substantially
Achieves performance and cost benefits over traditional Git workflows
Abstract
At the scale of Uber's monorepos, traditional Git workflows become a fundamental bottleneck. Cloning multi-gigabyte repositories, maintaining local checkouts, periodically syncing from upstream, and executing repetitive fetch or push operations consume substantial compute and I/O across hundreds of automation systems. Although Continuous Integration (CI) systems such as Jenkins and Buildkite provide caching mechanisms to reduce clone times, in practice these approaches incur significant infrastructure overhead, manual maintenance, inconsistent cache hit rates, and cold-start latencies of several minutes for large monorepos. Moreover, thousands of independent clone and fetch operations place heavy load on upstream Git servers, making them slow and difficult to scale. To address these limitations, we present GitFarm, a platform that provides Git as a stateful, identity-scoped,…
