Code Search based on Context-aware Code Translation

Weisong Sun; Chunrong Fang; Yuchen Chen; Guanhong Tao; Tingxu Han; Quanjun Zhang

arXiv:2202.08029·cs.SE·February 17, 2022

TL;DR

This paper introduces TranCS, a novel code search method that translates code snippets into natural language using context-aware execution simulation, significantly improving retrieval accuracy over existing techniques.

Contribution

The paper proposes a new context-aware code translation approach with a shared embedding space, enhancing semantic matching in code search tasks.

Findings

01. TranCS outperforms state-of-the-art methods by up to 66.50% in mean reciprocal rank (MRR).

02. Simulating the execution of code snippets improves semantic understanding of them.

03. A shared vocabulary for embeddings reduces divergence between query and code representations.
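For readers unfamiliar with the metric in the first finding, the following is a minimal sketch of how MRR is typically computed for a code search benchmark. The function names are hypothetical and are not taken from the TranCS repository. The second helper assumes that a figure such as 66.50% is a relative improvement over a baseline score, which this summary does not state explicitly.

```python
from typing import Optional, Sequence


def mean_reciprocal_rank(first_hit_ranks: Sequence[Optional[int]]) -> float:
    """Average of 1/rank over all queries.

    Each entry is the 1-based rank of the first relevant snippet returned
    for one query, or None when no relevant snippet was retrieved (such a
    query contributes 0 to the sum but still counts in the denominator).
    """
    if not first_hit_ranks:
        raise ValueError("need at least one query")
    total = 0.0
    for rank in first_hit_ranks:
        if rank is None:
            continue
        if rank < 1:
            raise ValueError("ranks are 1-based")
        total += 1.0 / rank
    return total / len(first_hit_ranks)


def relative_improvement_pct(new_score: float, baseline_score: float) -> float:
    """Relative gain of new_score over baseline_score, in percent.

    Assumption: this is one common way to report 'outperforms by X%';
    the summary does not say which convention the paper uses.
    """
    return (new_score - baseline_score) / baseline_score * 100.0
```

With illustrative ranks such as `[1, 2, 4]`, the reciprocal ranks are 1, 0.5, and 0.25, so the MRR is their mean. A higher MRR means the first correct snippet tends to appear nearer the top of the result list.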

Abstract

Code search is a technique widely used by developers during software development. Given a query, it returns semantically similar implementations from a large code corpus. Existing techniques leverage deep learning models to construct embedding representations for code snippets and queries, respectively. Features such as abstract syntax trees and control flow graphs are commonly employed to represent the semantics of code snippets. However, the same structure in these features does not necessarily denote the same semantics, and vice versa. In addition, these techniques use multiple different word mapping functions to map query words and code tokens to embedding representations, which causes the same word or token to receive diverged embeddings in queries and code snippets. We propose a novel context-aware code translation technique that translates…
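The divergence problem described in the abstract can be illustrated with a small sketch. It assumes that code snippets have already been translated into natural-language text, so queries and translations can share one vocabulary and one embedding table. Everything here is hypothetical: the function names, the random embedding table, and the mean-pooling encoder are stand-ins chosen for brevity, not the model used in the paper.

```python
import math
import random
from typing import Dict, List


def build_shared_vocab(texts: List[str]) -> Dict[str, int]:
    """One token-to-index map built over queries AND translated snippets."""
    vocab: Dict[str, int] = {}
    for text in texts:
        for token in text.lower().split():
            vocab.setdefault(token, len(vocab))
    return vocab


def init_embeddings(vocab_size: int, dim: int, seed: int = 0) -> List[List[float]]:
    """Random table standing in for learned embeddings (assumption)."""
    rng = random.Random(seed)
    return [[rng.uniform(-1.0, 1.0) for _ in range(dim)] for _ in range(vocab_size)]


def encode(text: str, vocab: Dict[str, int], table: List[List[float]]) -> List[float]:
    """Mean-pool token vectors; unknown tokens are skipped."""
    dim = len(table[0])
    rows = [table[vocab[t]] for t in text.lower().split() if t in vocab]
    if not rows:
        return [0.0] * dim
    return [sum(row[i] for row in rows) / len(rows) for i in range(dim)]


def cosine(a: List[float], b: List[float]) -> float:
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return sum(x * y for x, y in zip(a, b)) / (norm_a * norm_b)


def rank_snippets(query: str, translations: List[str],
                  vocab: Dict[str, int], table: List[List[float]]) -> List[int]:
    """Indices of translations, most similar to the query first."""
    q = encode(query, vocab, table)
    scored = [(cosine(q, encode(t, vocab, table)), i) for i, t in enumerate(translations)]
    return [i for _, i in sorted(scored, reverse=True)]
```

Because both sides index into the same table, a token such as "file" maps to the same vector whether it appears in a query or in a translated snippet. With two separately built tables, the same token could end up with unrelated vectors, which is the divergence the abstract refers to.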

Code & Models

Repositories

wssun/trancs (PyTorch, official)
