EVOR: Evolving Retrieval for Code Generation
Hongjin Su, Shuyang Jiang, Yuhang Lai, Haoyuan Wu, Boao Shi, Che Liu, Qian Liu, Tao Yu

TL;DR
EVOR is a retrieval-augmented code generation pipeline that evolves both its queries and its knowledge bases during generation instead of relying on a single static source. The abstract reports two to four times the execution accuracy of prior methods.
Contribution
The paper presents EVOR, a pipeline that synchronously evolves queries and diverse knowledge bases for retrieval-augmented code generation, and EVOR-BENCH, four new datasets covering frequently updated libraries and long-tail programming languages.
Findings
EVOR achieves two to four times the execution accuracy of existing methods such as Reflexion and DocPrompting.
Synchronous evolution of queries and knowledge bases improves performance over pipelines that use a static, single-source knowledge base.
Diverse information sources contribute to better code generation results.
Abstract
Recently, retrieval-augmented generation (RAG) has been successfully applied to code generation. However, existing pipelines for retrieval-augmented code generation (RACG) employ static knowledge bases with a single source, limiting the ability of Large Language Models (LLMs) to adapt to domains in which they have insufficient knowledge. In this work, we develop a novel pipeline, EVOR, that employs the synchronous evolution of both queries and diverse knowledge bases. For two realistic settings in which external knowledge is required to solve code generation tasks, we compile four new datasets associated with frequently updated libraries and long-tail programming languages, named EVOR-BENCH. Extensive experiments demonstrate that EVOR achieves two to four times the execution accuracy of other methods such as Reflexion (Shinn et al., 2024), DocPrompting (Zhou et al., 2023),…
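The abstract does not spell out how the queries and knowledge bases evolve, so the following is only a minimal sketch of the general idea, not the authors' implementation. It assumes that each failed attempt and its execution feedback are appended to the knowledge base and folded into the next query. The names `retrieve`, `evolve`, `toy_generate`, and `toy_execute` are hypothetical, the retriever is a plain token-overlap ranking, and the generator and executor are stubs standing in for an LLM and a code runner.

```python
from typing import Callable, List, Tuple


def tokenize(text: str) -> set:
    """Lowercase whitespace tokenization; a toy stand-in for a real encoder."""
    return set(text.lower().split())


def retrieve(query: str, knowledge_base: List[str], k: int = 2) -> List[str]:
    """Return the k entries sharing the most tokens with the query."""
    q = tokenize(query)
    ranked = sorted(knowledge_base,
                    key=lambda doc: len(q & tokenize(doc)),
                    reverse=True)
    return ranked[:k]


def evolve(task: str,
           knowledge_base: List[str],
           generate: Callable[[str, List[str]], str],
           execute: Callable[[str], Tuple[bool, str]],
           max_rounds: int = 3) -> Tuple[str, str, List[str]]:
    """Iterate retrieval and generation, updating query and knowledge base together."""
    kb = list(knowledge_base)  # copy so the caller's list is not mutated
    query = task
    program = ""
    for _ in range(max_rounds):
        context = retrieve(query, kb)
        program = generate(task, context)
        ok, feedback = execute(program)
        if ok:
            break
        # Assumption: the knowledge base grows with the failed attempt and its feedback.
        kb.append(f"attempt: {program} feedback: {feedback}")
        # Assumption: the next query is the task plus the draft program and feedback.
        query = f"{task} {program} {feedback}"
    return program, query, kb


def toy_generate(task: str, context: List[str]) -> str:
    """Stub for an LLM: uses the new call only if the context mentions it."""
    if any("new_api" in entry for entry in context):
        return "new_api()"
    return "old_api()"


def toy_execute(program: str) -> Tuple[bool, str]:
    """Stub for a code runner in which old_api has been removed."""
    if program == "new_api()":
        return True, ""
    return False, "NameError: old_api was removed, use new_api"
```

Calling `evolve("call the library entry point", ["unrelated note about logging"], toy_generate, toy_execute)` fails on the first round with the stubbed `old_api()` call, stores the feedback, and succeeds on the second round once the feedback is retrievable. A static pipeline with the same single-entry knowledge base would keep retrieving the unrelated note on every round.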
Taxonomy
Topics: Natural Language Processing Techniques · Speech and dialogue systems · Engineering and Information Technology
