Instructive Code Retriever: Learn from Large Language Model's Feedback for Code Intelligence Tasks
Jiawei Lu, Haoye Wang, Zhongxin Liu, Keyu Liang, Lingfeng Bao, Xiaohu Yang

TL;DR
This paper introduces Instructive Code Retriever (ICR), a novel method that learns semantic and structural query representations with LLM feedback to improve example retrieval for code intelligence tasks, significantly boosting performance.
Contribution
The paper proposes ICR, which leverages LLM feedback and a tree-based loss to better understand query semantics and structure, enhancing retrieval effectiveness across multiple code tasks.
Findings
ICR outperforms state-of-the-art retrieval methods.
It achieves 50-90% improvements in BLEU-4 and CodeBLEU scores.
It is effective on code summarization, synthesis, and bug fixing.
Abstract
Recent studies proposed to leverage large language models (LLMs) with In-Context Learning (ICL) to handle code intelligence tasks without fine-tuning. ICL employs task instructions and a set of examples as demonstrations to guide the model in generating accurate answers without updating its parameters. While ICL has proven effective for code intelligence tasks, its performance heavily relies on the selected examples. Previous work has achieved some success in using BM25 to retrieve examples for code intelligence tasks. However, existing approaches lack the ability to understand the semantic and structural information of queries, resulting in less helpful demonstrations. Moreover, they do not adapt well to the complex and dynamic nature of user queries in diverse domains. In this paper, we introduce a novel approach named Instructive Code Retriever (ICR), which is designed to retrieve…
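To make the baseline described in the abstract concrete, the following is a minimal sketch of BM25-based demonstration retrieval for ICL, assuming a code summarization task. It illustrates the prior lexical approach the abstract contrasts against, not ICR itself. The names `tokenize`, `BM25`, and `build_prompt`, the whitespace tokenizer, the prompt layout, and the parameter defaults `k1=1.5` and `b=0.75` are all illustrative assumptions rather than details from the paper.

```python
import math
from collections import Counter


def tokenize(text):
    # Assumption: naive lowercase whitespace tokenization, for illustration only.
    return text.lower().split()


class BM25:
    """Small Okapi BM25 scorer over a fixed pool of documents."""

    def __init__(self, corpus, k1=1.5, b=0.75):
        self.docs = [tokenize(d) for d in corpus]
        self.k1, self.b = k1, b
        self.avgdl = sum(len(d) for d in self.docs) / len(self.docs)
        self.tf = [Counter(d) for d in self.docs]
        # Document frequency: number of documents containing each term.
        df = Counter(t for d in self.docs for t in set(d))
        n = len(self.docs)
        # The "+1 inside the log" variant keeps every IDF value positive.
        self.idf = {t: math.log(1 + (n - f + 0.5) / (f + 0.5)) for t, f in df.items()}

    def score(self, query, i):
        tf, dl = self.tf[i], len(self.docs[i])
        total = 0.0
        for t in tokenize(query):
            if t not in tf:
                continue
            f = tf[t]
            norm = f + self.k1 * (1 - self.b + self.b * dl / self.avgdl)
            total += self.idf[t] * f * (self.k1 + 1) / norm
        return total

    def top_k(self, query, k):
        # Indices of the k highest-scoring documents, best first.
        order = sorted(range(len(self.docs)), key=lambda i: self.score(query, i), reverse=True)
        return order[:k]


def build_prompt(instruction, pool, query, k=2):
    """Assemble an ICL prompt: instruction, k retrieved demonstrations, then the query."""
    bm25 = BM25([code for code, _ in pool])
    parts = [instruction]
    for i in bm25.top_k(query, k):
        code, summary = pool[i]
        parts.append(f"Code: {code}\nSummary: {summary}")
    parts.append(f"Code: {query}\nSummary:")
    return "\n\n".join(parts)


# Hypothetical example pool of (code, summary) pairs.
pool = [
    ("def sub(a, b): return a - b", "Subtract b from a."),
    ("def read_file(path): return open(path).read()", "Read a file."),
    ("for x in items: print(x)", "Print each item."),
]
query = "def add(a, b): return a + b"
prompt = build_prompt("Summarize the code.", pool, query, k=2)
```

Because BM25 scores only token overlap, two snippets with the same behavior but different identifiers can score poorly against each other. That is the gap in semantic and structural understanding that the abstract attributes to existing approaches.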
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Software Engineering Research
