AlignCoder: Aligning Retrieval with Target Intent for Repository-Level Code Completion
Tianyue Jiang, Yanli Wang, Yanlin Wang, Daya Guo, Ensheng Shi, Yuchi Ma, Jiachi Chen, Zibin Zheng

TL;DR
AlignCoder enhances repository-level code completion by improving retrieval accuracy through query enhancement and reinforcement learning, significantly outperforming baselines across multiple benchmarks and models.
Contribution
AlignCoder introduces a query enhancement mechanism and a reinforcement-learning-based retriever training method to improve retrieval for repository-level code completion.
Findings
Achieves an 18.1% improvement in exact match (EM) score over baselines on CrossCodeEval
Demonstrates high generalizability across models and languages
Outperforms existing retrieval methods in code completion tasks
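For readers unfamiliar with the EM metric cited above, here is a minimal sketch of how an exact match score is commonly computed: the percentage of predictions that equal their reference completion. The function names and the whitespace trimming are assumptions made for illustration; the benchmark's own normalization may differ, and this listing does not say whether the 18.1% figure is a relative or an absolute gain.

```python
from typing import List


def exact_match(prediction: str, reference: str) -> bool:
    """Return True when the prediction equals the reference.

    Assumption: leading/trailing whitespace is ignored. The benchmark's
    real normalization rules are not stated in this listing.
    """
    return prediction.strip() == reference.strip()


def em_score(predictions: List[str], references: List[str]) -> float:
    """Percentage (0-100) of predictions that exactly match their reference."""
    if len(predictions) != len(references):
        raise ValueError("predictions and references must have equal length")
    if not references:
        return 0.0
    hits = sum(exact_match(p, r) for p, r in zip(predictions, references))
    return 100.0 * hits / len(references)
```

Under this definition a completion that differs by a single interior character counts as a miss, which is why EM is a strict metric for code completion.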
Abstract
Repository-level code completion remains a challenging task for existing code large language models (code LLMs) due to their limited understanding of repository-specific context and domain knowledge. While retrieval-augmented generation (RAG) approaches have shown promise by retrieving relevant code snippets as cross-file context, they suffer from two fundamental problems: misalignment between the query and the target code in the retrieval process, and the inability of existing retrieval methods to effectively utilize the inference information. To address these challenges, we propose AlignCoder, a repository-level code completion framework that introduces a query enhancement mechanism and a reinforcement learning based retriever training method. Our approach generates multiple candidate completions to construct an enhanced query that bridges the semantic gap between the initial query…
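The query enhancement idea described in the abstract can be illustrated with a toy sketch. Everything here is an assumption for illustration, not the authors' implementation: candidate completions are passed in as plain strings (standing in for samples from a code LLM), the enhanced query is formed by simple concatenation, and a Jaccard token overlap replaces the paper's reinforcement-learning-trained retriever.

```python
import re
from typing import List, Set


def tokenize(code: str) -> Set[str]:
    """Split code into a set of identifier-like tokens."""
    return set(re.findall(r"[A-Za-z_]\w*", code))


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Jaccard similarity between two token sets (0.0 when both are empty)."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def build_enhanced_query(unfinished_code: str, candidates: List[str]) -> str:
    """Hypothetical enhancement: append candidate completions to the query.

    Assumption: plain concatenation. The abstract only says candidates are
    used to construct the enhanced query, not how they are combined.
    """
    return "\n".join([unfinished_code, *candidates])


def retrieve(query: str, snippets: List[str], top_k: int = 1) -> List[str]:
    """Rank repository snippets by token overlap with the query."""
    q = tokenize(query)
    ranked = sorted(snippets, key=lambda s: jaccard(q, tokenize(s)), reverse=True)
    return ranked[:top_k]


# Toy data (hypothetical): an unfinished line, two guessed completions,
# and two cross-file snippets from the same repository.
UNFINISHED = "total = cart."
CANDIDATES = ["compute_total(tax_rate)", "sum_items()"]
SNIPPET_TARGET = (
    "def compute_total(self, tax_rate):\n"
    "    return self.subtotal * (1 + tax_rate)"
)
SNIPPET_OTHER = "def render_cart(cart):\n    return str(cart)"
SNIPPETS = [SNIPPET_OTHER, SNIPPET_TARGET]
```

In this toy setup the unfinished line alone shares only the token `cart` with the repository, so `retrieve(UNFINISHED, SNIPPETS)` ranks the unrelated `render_cart` snippet first. Calling `retrieve(build_enhanced_query(UNFINISHED, CANDIDATES), SNIPPETS)` ranks the `compute_total` definition first, because even an imperfect candidate contributes identifiers that appear near the target code.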
Taxonomy
Topics: Software Engineering Research · Topic Modeling · Software Testing and Debugging Techniques
