R2C2-Coder: Enhancing and Benchmarking Real-world Repository-level Code Completion Abilities of Code Large Language Models

Ken Deng; Jiaheng Liu; He Zhu; Congnan Liu; Jingxin Li; Jiakai Wang; Peng Zhao; Chenchen Zhang; Yanan Wu; Xueqiao Yin; Yuanxing Zhang; Zizheng Zhan; Wenbo Su; Bangyu Xiang; Tiezheng Ge; Bo Zheng

arXiv:2406.01359 · cs.CL · September 5, 2025 · 1 cites

TL;DR

This paper introduces R2C2-Coder, a novel framework with enhanced prompt construction and a comprehensive benchmark to improve and evaluate repository-level code completion capabilities of large language models.

Contribution

The paper presents R2C2-Coder, which combines a code prompt construction method (R2C2-Enhance) with a more challenging benchmark (R2C2-Bench) for assessing repository-level code completion.

Findings

1. R2C2-Enhance improves prompt quality for code completion.
2. R2C2-Bench provides a more realistic evaluation environment.
3. Extensive experiments show R2C2-Coder's effectiveness.

Abstract

Code completion models have made significant progress in recent years. Repository-level code completion has recently drawn more attention in modern software development, and several baseline methods and benchmarks have been proposed. However, existing repository-level code completion methods often fail to make full use of the extensive context of a project repository, such as the intricacies of relevant files and class hierarchies. In addition, existing benchmarks usually focus on a limited set of code completion scenarios and therefore cannot adequately reflect the repository-level code completion abilities of existing methods. To address these limitations, we propose R2C2-Coder to enhance and benchmark the real-world repository-level code completion abilities of code Large Language Models. R2C2-Coder includes a code prompt construction method, R2C2-Enhance, and a well-designed benchmark…

Taxonomy

Topics: Software Testing and Debugging Techniques · Natural Language Processing Techniques · Service-Oriented Architecture and Web Services
