A Sketch-Based Neural Model for Generating Commit Messages from Diffs
Nicolae-Teodor Pavel, Traian Rebedea

TL;DR
This paper introduces a sketch-based neural model that significantly improves the generation of descriptive commit messages from code diffs, especially for Java files, by enhancing neural machine translation techniques.
Contribution
It presents a novel sketch-based encoder that outperforms existing methods in translating code diffs into commit messages, with improvements tailored for specific file types.
Findings
Sketch-based encoder outperforms state-of-the-art models
Training on specific file types improves performance
Significant gains for Java source code files
Abstract
Commit messages have an important impact on software development, especially in large teams. Multiple developers with different writing styles are often involved in the same project. For this reason, it can be difficult to maintain a consistent pattern of informative commit messages, and the most frequent issue is that these messages are not descriptive enough. In this paper, we apply neural machine translation (NMT) techniques to convert code diffs into commit messages, and we present an improved sketch-based encoder for this task. We split the approach into three parts. First, we focus on finding a more suitable NMT baseline for this problem. Second, we show that the performance of the NMT models can be improved by training on examples containing a specific file type. Lastly, we introduce a novel sketch-based neural model inspired by recent approaches…
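The second step of the abstract, training on examples of a specific file type, can be illustrated with a small filtering sketch. This is not the authors' pipeline: the function names, the `(diff, message)` pair layout, and the choice to keep only diffs whose files all share one suffix are assumptions made for illustration. The only external convention relied on is the `diff --git a/<path> b/<path>` header that git emits for each changed file.

```python
import re
from typing import Iterable, List, Tuple

# Header line git emits once per changed file in a unified diff.
# Assumption: paths contain no whitespace (quoted paths are not handled).
_DIFF_HEADER = re.compile(r"^diff --git a/(\S+) b/(\S+)$", re.MULTILINE)


def changed_paths(diff_text: str) -> List[str]:
    """Return the post-change path of every file touched by the diff."""
    return [match.group(2) for match in _DIFF_HEADER.finditer(diff_text)]


def touches_only(diff_text: str, suffix: str) -> bool:
    """True if the diff changes at least one file and all of them end with suffix."""
    paths = changed_paths(diff_text)
    return bool(paths) and all(path.endswith(suffix) for path in paths)


def filter_by_file_type(
    pairs: Iterable[Tuple[str, str]], suffix: str = ".java"
) -> List[Tuple[str, str]]:
    """Keep (diff, commit message) pairs whose diff touches only files of one type."""
    return [(diff, msg) for diff, msg in pairs if touches_only(diff, suffix)]


# Hypothetical toy examples, not data from the paper.
JAVA_DIFF = (
    "diff --git a/src/Foo.java b/src/Foo.java\n"
    "--- a/src/Foo.java\n"
    "+++ b/src/Foo.java\n"
    "@@ -1 +1 @@\n"
    "-int x;\n"
    "+int y;\n"
)
README_DIFF = (
    "diff --git a/README.md b/README.md\n"
    "--- a/README.md\n"
    "+++ b/README.md\n"
    "@@ -1 +1 @@\n"
    "-old\n"
    "+new\n"
)
EXAMPLES = [
    (JAVA_DIFF, "Rename field x to y"),
    (README_DIFF, "Update README"),
    (JAVA_DIFF + README_DIFF, "Rename field and document it"),
]

java_only = filter_by_file_type(EXAMPLES, ".java")
```

Here `java_only` keeps only the first pair, because the third diff also touches a Markdown file. A looser rule, keeping any diff that contains at least one Java file, would replace `all` with `any`; the abstract does not say which reading the authors used.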
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Speech and dialogue systems
