Deep Graph Matching and Searching for Semantic Code Retrieval

Xiang Ling; Lingfei Wu; Saizhuo Wang; Gaoning Pan; Tengfei Ma; Fangli Xu; Alex X. Liu; Chunming Wu; Shouling Ji

arXiv:2010.12908·cs.AI·June 23, 2021

TL;DR

This paper introduces DGMS, a deep graph matching model that leverages graph neural networks to improve semantic code retrieval by capturing structural features of both natural language queries and code snippets.

Contribution

The paper proposes an end-to-end graph neural network model that unifies and matches the structural features of queries and code for enhanced retrieval accuracy.

Findings

1. DGMS outperforms state-of-the-art models on Java and Python datasets.
2. Structural graph representations improve code retrieval performance.
3. Ablation studies confirm the effectiveness of each component in DGMS.

Abstract

Code retrieval is the task of finding, in a large corpus of source code repositories, the code snippet that best matches a natural language query. Recent work mainly applies natural language processing techniques to both query texts (i.e., human natural language) and code snippets (i.e., machine programming language), but neglects the deep structural features of queries and source code, both of which carry rich semantic information. In this paper, we propose an end-to-end deep graph matching and searching (DGMS) model based on graph neural networks for the task of semantic code retrieval. To this end, we first represent both natural language query texts and programming language code snippets as unified graph-structured data, and then use the proposed graph matching and searching model to retrieve the best-matching code snippet. In particular, DGMS not only…
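To make the general idea concrete, the following is a minimal sketch of graph-based retrieval: a query and each candidate snippet are given as small graphs, each graph is reduced to a vector, and candidates are ranked by similarity to the query. It is not the DGMS architecture. The mean-neighbor aggregation, max-pooling, and cosine scoring are simple stand-ins chosen for illustration, and every name here (`message_pass`, `graph_embedding`, `rank_snippets`, the toy graphs and their hand-written node features) is hypothetical.

```python
import math


def message_pass(features, edges):
    """One round of mean aggregation over each node and its neighbors.

    Assumption: edges are undirected and every node also sees itself.
    """
    neighbors = {i: [i] for i in range(len(features))}
    for u, v in edges:
        neighbors[u].append(v)
        neighbors[v].append(u)
    dim = len(features[0])
    return [
        [sum(features[j][k] for j in neighbors[i]) / len(neighbors[i])
         for k in range(dim)]
        for i in range(len(features))
    ]


def graph_embedding(features, edges):
    """Max-pool the node states into a single graph-level vector."""
    hidden = message_pass(features, edges)
    return [max(column) for column in zip(*hidden)]


def cosine(a, b):
    """Cosine similarity, defined as 0.0 when either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def rank_snippets(query_graph, code_graphs):
    """Return candidate names ordered from best to worst match."""
    q = graph_embedding(*query_graph)
    scores = {name: cosine(q, graph_embedding(*g))
              for name, g in code_graphs.items()}
    return sorted(scores, key=scores.get, reverse=True)


# Toy data: each graph is (node_features, edges). The 3-dimensional
# features are made up; a real system would learn them.
query = ([[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]], [(0, 1)])
candidates = {
    "snippet_a": ([[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 1.0, 0.0]],
                  [(0, 1), (1, 2)]),
    "snippet_b": ([[0.0, 0.0, 1.0], [0.0, 0.0, 1.0]], [(0, 1)]),
}

ranking = rank_snippets(query, candidates)
print(ranking)
```

In this toy setup `snippet_a` shares feature dimensions with the query and ranks first, while `snippet_b` shares none and scores zero. A learned model would replace the fixed features and the parameter-free aggregation with trained components.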
