Mapping Language to Code in Programmatic Context
Srinivasan Iyer, Ioannis Konstas, Alvin Cheung, Luke Zettlemoyer

TL;DR
This paper introduces a new task of generating Java class member functions from English documentation and class context, along with a large dataset and a specialized model to address this challenge.
Contribution
The paper presents the CONCODE dataset and a novel encoder-decoder model that jointly considers documentation and class context for code generation.
Findings
The CONCODE dataset contains over 100,000 examples built from Java classes in online code repositories.
The proposed model improves code generation accuracy over baselines.
Error analysis highlights future research directions.
Abstract
Source code is rarely written in isolation. It depends significantly on the programmatic context, such as the class that the code would reside in. To study this phenomenon, we introduce the task of generating class member functions given English documentation and the programmatic context provided by the rest of the class. This task is challenging because the desired code can vary greatly depending on the functionality the class provides (e.g., a sort function may or may not be available when we are asked to "return the smallest element" in a particular member variable list). We introduce CONCODE, a new large dataset with over 100,000 examples consisting of Java classes from online code repositories, and develop a new encoder-decoder architecture that models the interaction between the method documentation and the class environment. We also present a detailed error analysis suggesting…
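The "return the smallest element" example in the abstract can be made concrete with a small sketch. The two classes below are hypothetical illustrations, not drawn from CONCODE: the names `SortedBag`, `PlainBag`, `items`, and `sortItems` are assumptions, and both classes assume a non-empty list. Both `smallest` methods answer the same English documentation, but the natural implementation differs with what the surrounding class already provides.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical class whose context already offers a sort helper.
class SortedBag {
    final List<Integer> items = new ArrayList<>();

    // Existing member method that a generated method could reuse.
    void sortItems() {
        Collections.sort(items);
    }

    // Documentation: "return the smallest element".
    // With sortItems() available, the target can sort and take the head.
    // Note that this reorders the items list as a side effect.
    int smallest() {
        sortItems();
        return items.get(0);
    }
}

// Hypothetical class with the same member variable but no sort helper.
class PlainBag {
    final List<Integer> items = new ArrayList<>();

    // Same documentation, different context: scan for the minimum instead.
    int smallest() {
        int min = items.get(0);
        for (int x : items) {
            if (x < min) {
                min = x;
            }
        }
        return min;
    }
}
```

A model that reads only the documentation cannot tell these two targets apart; the member variables and methods of the enclosing class are what distinguish them.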
