When More Retrieval Hurts: Retrieval-Augmented Code Review Generation
Qianru Meng, Xiao Zhang, Zhaochen Ren, Joost Visser

TL;DR
This paper introduces RARe, a retrieval-augmented framework for code review generation that uses relevant historical reviews to improve output quality. It also finds that retrieving too many examples can hurt performance.
Contribution
The paper proposes RARe, a novel retrieval-augmented approach for code review generation that effectively incorporates historical reviews as in-context examples for large language models.
Findings
RARe outperforms strong baselines on two public benchmarks, reaching BLEU-4 scores of 12.32 and 12.96.
Using only the top-1 retrieved example yields the best results.
Retrieving more examples can degrade performance because of redundancy and conflicting cues.
Abstract
Code review generation can reduce developer effort by producing concise, reviewer-style feedback for a given code snippet or code change. However, generation-only models often produce generic or off-point reviews, while retrieval-only methods struggle to adapt well to new contexts. In this paper, we view retrieval augmentation for code review as retrieval-augmented in-context learning, where retrieved historical reviews are placed in the input context as examples that guide the model's output. Based on this view, we propose RARe (Retrieval-Augmented Code Reviewer), a framework that retrieves relevant historical reviews from a corpus and conditions a large language model on the retrieved in-context examples. Experiments on two public benchmarks show that RARe outperforms strong baselines and reaches BLEU-4 scores of 12.32 and 12.96. A key finding is that more retrieval can hurt: using…
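The retrieve-then-condition idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the token-overlap (Jaccard) similarity is a stand-in for whatever retriever RARe actually uses, and the names ReviewExample, retrieve, and build_prompt, as well as the prompt layout, are hypothetical. The default k=1 mirrors the reported finding that a single retrieved example works best.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class ReviewExample:
    """One historical (code, review) pair from the retrieval corpus."""
    code: str
    review: str


def tokenize(text: str) -> set[str]:
    # Crude whitespace tokenization; an assumption made for this sketch only.
    return set(text.lower().split())


def jaccard(a: set[str], b: set[str]) -> float:
    # Token-set overlap in [0, 1]; stands in for a real retriever's score.
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def retrieve(query_code: str, corpus: list[ReviewExample], k: int = 1) -> list[ReviewExample]:
    # Rank historical examples by similarity to the query code, keep the top k.
    query_tokens = tokenize(query_code)
    ranked = sorted(
        corpus,
        key=lambda ex: jaccard(query_tokens, tokenize(ex.code)),
        reverse=True,
    )
    return ranked[:k]


def build_prompt(query_code: str, examples: list[ReviewExample]) -> str:
    # Place retrieved pairs before the query as in-context examples
    # (hypothetical layout), leaving the final review for the model to write.
    parts = [f"Code:\n{ex.code}\nReview:\n{ex.review}\n" for ex in examples]
    parts.append(f"Code:\n{query_code}\nReview:\n")
    return "\n".join(parts)
```

Calling build_prompt(query, retrieve(query, corpus)) yields a prompt string that would then be passed to a large language model. Raising k adds more examples to the context, which is the setting the paper reports can degrade results.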
Taxonomy
Topics: Software Engineering Research · Software Testing and Debugging Techniques · Topic Modeling
