Multi-Modal Attention Network Learning for Semantic Source Code Retrieval

Yao Wan, Jingdong Shu, Yulei Sui, Guandong Xu, Zhou Zhao, Jian Wu, and Philip S. Yu

arXiv:1909.13516 · cs.SE · October 1, 2019

TL;DR

This paper introduces MMAN, a multi-modal attention network that combines sequential, structural, and graph-based features of source code to improve semantic code retrieval accuracy and interpretability.

Contribution

The paper proposes a novel multi-modal representation and attention fusion mechanism that integrates code tokens, ASTs, and CFGs for enhanced code retrieval.
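To make the idea of attention-based fusion concrete, here is a minimal sketch in plain Python. It is an illustration under stated assumptions, not the paper's exact formulation: it assumes each modality (tokens, AST, CFG) has already been encoded into a fixed-size vector, that a hypothetical learned `context` vector scores each modality, and that retrieval ranks snippets by cosine similarity to a query vector. The names `fuse`, `cosine`, and `context`, and the toy vectors, are invented for this example.

```python
import math


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def fuse(modalities, context):
    """Attention-weighted sum of per-modality vectors.

    modalities: dict mapping a modality name to its embedding vector.
    context: hypothetical learned vector used to score each modality
             (an assumption of this sketch).
    Returns the fused vector and the attention weight per modality.
    """
    names = list(modalities)
    weights = softmax([dot(modalities[n], context) for n in names])
    fused = [
        sum(w * modalities[n][i] for w, n in zip(weights, names))
        for i in range(len(context))
    ]
    return fused, dict(zip(names, weights))


def cosine(a, b):
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    norm_a, norm_b = math.sqrt(dot(a, a)), math.sqrt(dot(b, b))
    return dot(a, b) / (norm_a * norm_b) if norm_a and norm_b else 0.0


# Toy embeddings standing in for the outputs of real encoders.
snippet = {
    "tokens": [1.0, 0.0, 0.0],
    "ast": [0.0, 1.0, 0.0],
    "cfg": [0.0, 0.0, 1.0],
}
context = [2.0, 1.0, 0.0]   # hypothetical learned attention context
query = [1.0, 0.5, 0.0]     # hypothetical embedding of a user query

fused, weights = fuse(snippet, context)
score = cosine(fused, query)
```

In this sketch the `weights` dictionary is what would be inspected for interpretability: it shows how much each modality contributed to the fused representation of a given snippet.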

Findings

1. Outperforms state-of-the-art code retrieval methods.

2. Effectively combines multiple code features for better accuracy.

3. Provides interpretable results through attention weights.

Abstract

Code retrieval techniques and tools play a key role in helping software developers retrieve existing code fragments from available open-source repositories given a user query. Despite existing efforts to improve the effectiveness of code retrieval, two main issues still hinder these techniques from accurately retrieving satisfactory code fragments from large-scale repositories when answering complicated queries. First, existing approaches consider only shallow features of source code, such as method names and code tokens, and ignore structured features such as abstract syntax trees (ASTs) and control-flow graphs (CFGs), which contain rich and well-defined semantics of source code. Second, although deep learning-based approaches perform well at representing source code, they lack explainability, making it hard to…
