Hierarchical Learning of Cross-Language Mappings through Distributed Vector Representations for Code

Nghi D. Q. Bui; Lingxiao Jiang

arXiv:1803.04715 · cs.LG · March 14, 2018

TL;DR

This paper introduces a hierarchical approach to learning cross-language code representations: it enriches token streams with structural information and learns shared embeddings that may be used for program translation between languages such as Java and C#.

Contribution

It proposes a novel hierarchical method to learn shared embeddings for code elements across languages, improving cross-language mapping accuracy.

Findings

01. Learned shared embeddings for code elements in Java and C#

02. Achieved reasonable Mean Average Precision (MAP) scores on cross-language mappings

03. Outperformed existing tools in API method mapping accuracy
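Since the findings are reported as MAP scores, a minimal sketch of how Mean Average Precision is commonly computed may help in reading them. The paper's exact evaluation protocol is not given on this page, so the function names, the `java:`/`cs:` identifiers, and the toy rankings below are all hypothetical and serve only as illustration.

```python
from typing import Dict, List, Set


def average_precision(ranked: List[str], relevant: Set[str]) -> float:
    """Average of precision@k over the ranks k at which a relevant item appears."""
    if not relevant:
        return 0.0
    hits = 0
    precision_sum = 0.0
    for k, candidate in enumerate(ranked, start=1):
        if candidate in relevant:
            hits += 1
            precision_sum += hits / k
    return precision_sum / len(relevant)


def mean_average_precision(
    rankings: Dict[str, List[str]], gold: Dict[str, Set[str]]
) -> float:
    """Mean of the per-query average precision over all queries."""
    if not rankings:
        return 0.0
    total = sum(
        average_precision(ranked, gold.get(query, set()))
        for query, ranked in rankings.items()
    )
    return total / len(rankings)


# Hypothetical toy data: each Java element is a query with a ranked list of
# C# candidates; `gold` holds the mappings assumed to be correct.
rankings = {
    "java:ArrayList.add": ["cs:List.Add", "cs:List.Remove"],
    "java:String.length": ["cs:String.Trim", "cs:String.Length"],
}
gold = {
    "java:ArrayList.add": {"cs:List.Add"},
    "java:String.length": {"cs:String.Length"},
}

map_score = mean_average_precision(rankings, gold)
print(map_score)  # 0.75: the first query scores 1.0, the second 0.5
```

When every query has exactly one correct mapping, as in this toy data, average precision reduces to the reciprocal rank of that mapping.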

Abstract

Translating a program written in one programming language to another can be useful for software development tasks that need functionality implementations in different languages. Although past studies have considered this problem, they may be either specific to the language grammars, or specific to certain kinds of code elements (e.g., tokens, phrases, API uses). This paper proposes a new approach to automatically learn cross-language representations for various kinds of structural code elements that may be used for program translation. Our key idea is twofold: first, we normalize and enrich code token streams with additional structural and semantic information, and train cross-language vector representations for the tokens (a.k.a. shared embeddings) based on word2vec, a neural-network-based technique for producing word embeddings; second, hierarchically from bottom up, we construct…
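To make the idea of shared embeddings concrete, the sketch below ranks C# candidates for a Java element by cosine similarity in a common vector space. It assumes that mappings are read off by nearest-neighbor lookup, which is a common way to use word2vec-style embeddings but is not confirmed by the truncated abstract. The three-dimensional vectors and the element names are invented for illustration and are not learned from any corpus.

```python
import math
from typing import Dict, List, Tuple

Vector = List[float]


def cosine(u: Vector, v: Vector) -> float:
    """Cosine similarity between two vectors; 0.0 if either has zero length."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def rank_candidates(
    query: Vector, candidates: Dict[str, Vector]
) -> List[Tuple[str, float]]:
    """Return candidates sorted by decreasing cosine similarity to the query."""
    scored = [(name, cosine(query, vec)) for name, vec in candidates.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Hypothetical embeddings: in a shared space, elements from both languages
# live in the same coordinate system, so they can be compared directly.
java_vectors = {"java:String.length": [0.9, 0.1, 0.0]}
csharp_vectors = {
    "cs:String.Length": [0.8, 0.2, 0.1],
    "cs:List.Add": [0.0, 0.1, 0.9],
}

ranking = rank_candidates(java_vectors["java:String.length"], csharp_vectors)
for name, score in ranking:
    print(f"{name}\t{score:.3f}")
```

A ranked list of this kind is the input that a MAP-style evaluation would consume, with one list per source-language element.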

Code & Models

Repositories

bdqnghi/hierarchical-programming-language-mapping
Official
