TL;DR
This paper introduces REDCODER, a retrieval-augmented framework that improves code generation and summarization by retrieving relevant code or summaries and supplying them to the generation model. It extends dense retrieval to unimodal and bimodal retrieval databases and is evaluated on Java and Python datasets.
Contribution
REDCODER is the first framework to extend dense retrieval to unimodal and bimodal code and summary databases for improved code generation and summarization.
Findings
REDCODER outperforms baseline models on Java and Python datasets.
Retrieval augmentation significantly improves code and summary quality.
The framework effectively handles unimodal and bimodal retrieval databases.
Abstract
Software developers write a lot of source code and documentation during software development. While implementing or documenting software, developers often recall parts of source code or code summaries that they wrote in the past. To mimic this behavior, we propose a retrieval-augmented framework, REDCODER, that retrieves relevant code or summaries from a retrieval database and provides them as a supplement to code generation or summarization models. REDCODER is distinctive in two ways. First, it extends the state-of-the-art dense retrieval technique to search for relevant code or summaries. Second, it can work with retrieval databases that include unimodal instances (only code or only natural language descriptions) or bimodal instances (code-description pairs). We conduct experiments and extensive analysis on two benchmark datasets of code…
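The retrieve-then-augment idea described in the abstract can be illustrated with a small sketch. This is not the authors' implementation. The `Entry` class, the `embed`, `retrieve`, and `augment` functions, and the `[SEP]` separator are all hypothetical names chosen for illustration. A normalized bag-of-words vector stands in for the learned dense encoder, and the final generation model is left out entirely. The sketch only shows how a database mixing unimodal and bimodal entries might be searched and how the results might be attached to the model input.

```python
import math
import re
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Entry:
    """One retrieval-database instance (hypothetical structure).

    Bimodal entries set both fields; unimodal entries set only one.
    """
    code: Optional[str] = None
    summary: Optional[str] = None


def embed(text: str) -> Dict[str, float]:
    """Toy stand-in for a dense encoder: an L2-normalized bag of words."""
    counts: Dict[str, float] = {}
    for token in re.findall(r"[a-z0-9]+", text.lower()):
        counts[token] = counts.get(token, 0.0) + 1.0
    norm = math.sqrt(sum(v * v for v in counts.values())) or 1.0
    return {token: v / norm for token, v in counts.items()}


def score(query_vec: Dict[str, float], doc_vec: Dict[str, float]) -> float:
    """Inner product of two sparse vectors."""
    return sum(w * doc_vec.get(token, 0.0) for token, w in query_vec.items())


def retrieve(query: str, database: List[Entry], target: str, k: int = 2) -> List[Entry]:
    """Return the top-k entries whose `target` field best matches the query.

    target is "code" for code generation and "summary" for summarization
    (an assumption of this sketch). Entries lacking that field are skipped.
    """
    query_vec = embed(query)
    candidates = [e for e in database if getattr(e, target) is not None]
    ranked = sorted(
        candidates,
        key=lambda e: score(query_vec, embed(getattr(e, target))),
        reverse=True,
    )
    return ranked[:k]


def augment(query: str, retrieved: List[Entry], target: str) -> str:
    """Append retrieved candidates to the query to form the generator input.

    For bimodal entries the paired text is appended as well.
    """
    parts = [query]
    for entry in retrieved:
        parts.append(getattr(entry, target))
        paired = entry.summary if target == "code" else entry.code
        if paired is not None:
            parts.append(paired)
    return " [SEP] ".join(parts)
```

In this sketch, a code generation request would call `retrieve(description, database, target="code")` and pass the output of `augment` to whatever generation model is in use. Summarization swaps the roles, with code as the query and `target="summary"`.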
