Learning to Commit: Generating Organic Pull Requests via Online Repository Memory
Mo Li, L.H. Xu, Qitai Tan, Ting Cao, and Yunxin Liu

TL;DR
This paper introduces Learning to Commit, a framework in which a coding agent builds an Online Repository Memory from a repository's earlier commits and uses the change patterns it learns there to generate more organic, project-aligned pull requests.
Contribution
The paper proposes an online repository memory that captures project-specific coding styles and constraints, making generated pull requests more realistic.
Findings
Improves code style and API reuse in generated pull requests.
Enhances the plausibility of modifications in future tasks.
Effectively captures project-specific coding patterns.
Abstract
Large language model (LLM)-based coding agents achieve impressive results on controlled benchmarks yet routinely produce pull requests that real maintainers reject. The root cause is not functional incorrectness but a lack of organicity: generated code ignores project-specific conventions, duplicates functionality already provided by internal APIs, and violates implicit architectural constraints accumulated over years of development. Simply exposing an agent to the latest repository snapshot is not enough: the snapshot reveals the final state of the codebase, but not the repository-specific change patterns by which that state was reached. We introduce Learning to Commit, a framework that closes this gap through Online Repository Memory. Given a repository with a strict chronological split, the agent performs supervised contrastive reflection on earlier commits: it blindly attempts to…
