CRaDLe: Deep Code Retrieval Based on Semantic Dependency Learning
Wenchao Gu, Zongjie Li, Cuiyun Gao, Chaozheng Wang, Hongyu Zhang, Zenglin Xu, Michael R. Lyu

TL;DR
CRaDLe is a code retrieval method that learns statement-level semantic dependencies to match natural language queries with code snippets more accurately; the authors report that it outperforms existing approaches.
Contribution
The paper proposes a new neural approach that incorporates statement-level dependency information into code representations for more effective code retrieval.
Findings
Significantly outperforms state-of-the-art methods
Accurately captures code semantics through dependency learning
Effective on real-world datasets
Abstract
Code retrieval is a common practice for programmers to reuse existing code snippets in open-source repositories. Given a user query (i.e., a natural language description), code retrieval aims to find the most relevant snippets in a set of code snippets. The main challenge of effective code retrieval lies in mitigating the semantic gap between natural language descriptions and code snippets. With the ever-increasing amount of available open-source code, recent studies resort to neural networks to learn the semantic matching relationships between the two sources. Statement-level dependency information, which highlights the dependency relations among program statements during execution, reflects the structural importance of a statement in the code. It is favorable for accurately capturing code semantics but has never been explored for the code retrieval task. In…
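To make "statement-level dependency" concrete, here is a minimal sketch of one kind of dependency, the data (def-use) dependency, between statements. It is an illustration only and not the authors' extraction pipeline, which this excerpt does not describe. The function name `statement_dependencies` and the sample snippet are hypothetical, and the sketch assumes straight-line, top-level Python code with simple assignments (no control flow, no augmented assignment).

```python
import ast


def statement_dependencies(source: str):
    """Return sorted edges (i, j): statement j reads a variable that
    statement i most recently assigned. Toy def-use analysis only."""
    tree = ast.parse(source)
    last_def = {}  # variable name -> index of the statement that last wrote it
    edges = set()
    for idx, stmt in enumerate(tree.body):
        reads, writes = set(), set()
        for node in ast.walk(stmt):
            if isinstance(node, ast.Name):
                if isinstance(node.ctx, ast.Load):
                    reads.add(node.id)
                else:
                    writes.add(node.id)
        # Resolve reads against earlier definitions before recording writes,
        # so `a = a + 1` depends on the previous definition of `a`.
        for name in reads:
            if name in last_def:
                edges.add((last_def[name], idx))
        for name in writes:
            last_def[name] = idx
    return sorted(edges)


# Hypothetical example snippet: four top-level statements, indexed 0..3.
SNIPPET = "a = 1\nb = a + 2\nc = 5\nd = b * c\n"
print(statement_dependencies(SNIPPET))
```

In this toy graph the last statement has two incoming edges, which gives a rough sense of how dependency structure can indicate that some statements matter more than others to what the code computes.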
