A Neural Architecture for Generating Natural Language Descriptions from Source Code Changes
Pablo Loyola, Edison Marrese-Taylor, Yutaka Matsuo

TL;DR
This paper introduces a neural encoder-decoder model that generates natural language descriptions of source code changes from commit data. It is evaluated on twelve open source projects spanning four programming languages, in both in-project and cross-project settings.
Contribution
The paper presents an encoder-decoder architecture trained on commits, pairing each code modification with the message its author wrote, to generate descriptions of code changes. It is evaluated both within a single project and across projects.
Findings
Generates feasible and semantically sound descriptions of code changes
Evaluated on twelve open source projects across four programming languages
Produces feasible descriptions in cross-project settings, not only in-project
Abstract
We propose a model to automatically describe changes introduced in the source code of a program using natural language. Our method receives as input a set of code commits, each of which contains both the modifications and the message written by a user. These two modalities are used to train an encoder-decoder architecture. We evaluated our approach on twelve real-world open source projects from four different programming languages. Quantitative and qualitative results showed that the proposed approach can generate feasible and semantically sound descriptions not only in standard in-project settings, but also in a cross-project setting.
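To make the input described in the abstract concrete, the sketch below turns one commit into a (source tokens, target tokens) training pair for an encoder-decoder model. It is an illustration only, not the authors' preprocessing: the names `tokenize_diff`, `make_pair`, and `TOKEN_RE`, the `<add>`/`<del>` marker tokens, and the tokenization rule are all assumptions introduced here.

```python
import re
from typing import List, Tuple

# Hypothetical tokenizer (an assumption, not from the paper): identifiers,
# integer literals, and single punctuation characters become separate tokens.
TOKEN_RE = re.compile(r"[A-Za-z_]\w*|\d+|[^\sA-Za-z_\d]")


def tokenize_diff(diff: str) -> List[str]:
    """Flatten a unified diff into a token sequence for the encoder.

    Added and removed lines are prefixed with the marker tokens <add> and
    <del>; unchanged context lines are dropped. As a simplification, any
    line starting with '+++', '---', or '@@' is treated as a header.
    """
    tokens: List[str] = []
    for line in diff.splitlines():
        if line.startswith(("+++", "---", "@@")):
            continue  # skip file headers and hunk markers
        if line.startswith("+"):
            tokens.append("<add>")
            tokens.extend(TOKEN_RE.findall(line[1:]))
        elif line.startswith("-"):
            tokens.append("<del>")
            tokens.extend(TOKEN_RE.findall(line[1:]))
    return tokens


def make_pair(diff: str, message: str) -> Tuple[List[str], List[str]]:
    """Build one training example: diff tokens in, message tokens out."""
    return tokenize_diff(diff), message.lower().split()


example_diff = "--- a/f.py\n+++ b/f.py\n@@ -1 +1 @@\n-x = 1\n+x = 2\n"
source, target = make_pair(example_diff, "Fix default value")
print(source)
print(target)
```

Under these assumptions, the encoder would consume `source` and the decoder would be trained to emit `target`. A cross-project evaluation would build pairs this way from one project for training and from a different project for testing.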
Taxonomy
Topics: Software Engineering Research · Advanced Malware Detection Techniques · Topic Modeling
