RACE: Retrieval-Augmented Commit Message Generation
Ensheng Shi, Yanlin Wang, Wei Tao, Lun Du, Hongyu Zhang, Shi Han, Dongmei Zhang, Hongbin Sun

TL;DR
RACE is a retrieval-augmented neural method for commit message generation. It treats a similar past commit as an exemplar and guides generation according to the semantic similarity between the retrieved and current code diffs, which improves accuracy over baseline models.
Contribution
The paper introduces RACE, a novel retrieval-augmented approach with an exemplar guider for more accurate commit message generation, outperforming existing models.
Findings
RACE outperforms all baseline models in experiments.
RACE enhances existing Seq2Seq models.
The method is effective across five programming languages.
Abstract
Commit messages are important for software development and maintenance. Many neural network-based approaches have been proposed and shown promising results on automatic commit message generation. However, the generated commit messages could be repetitive or redundant. In this paper, we propose RACE, a new retrieval-augmented neural commit message generation method, which treats the retrieved similar commit as an exemplar and leverages it to generate an accurate commit message. As the retrieved commit message may not always accurately describe the content/intent of the current code diff, we also propose an exemplar guider, which learns the semantic similarity between the retrieved and current code diff and then guides the generation of commit message based on the similarity. We conduct extensive experiments on a large public dataset with five programming languages. Experimental results…
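The retrieve-then-guide idea in the abstract can be illustrated with a minimal sketch. This is not the paper's implementation: RACE learns the similarity with a neural model, whereas the sketch below substitutes a plain bag-of-tokens cosine similarity as a stand-in. The names `bag_of_tokens`, `cosine`, `retrieve_exemplar`, and the toy corpus are all hypothetical and chosen only for illustration.

```python
import math
from collections import Counter


def bag_of_tokens(diff: str) -> Counter:
    """Count whitespace-separated tokens in a code diff."""
    return Counter(diff.split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two token-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def retrieve_exemplar(query_diff: str, corpus: list[tuple[str, str]]):
    """Return (diff, message, similarity) for the most similar past commit.

    Assumption: lexical cosine similarity stands in for the learned
    semantic similarity described in the abstract.
    """
    q = bag_of_tokens(query_diff)
    scored = [(cosine(q, bag_of_tokens(d)), d, m) for d, m in corpus]
    sim, diff, msg = max(scored, key=lambda item: item[0])
    return diff, msg, sim


# Hypothetical toy corpus of (diff, commit message) pairs.
corpus = [
    ("- return a + b\n+ return a - b", "Fix subtraction bug"),
    ("+ import logging\n+ logging.info('start')", "Add startup logging"),
]
query = "- return x + y\n+ return x - y"

_, exemplar_msg, weight = retrieve_exemplar(query, corpus)
# `weight` plays the role of the guider's signal: a low value suggests the
# exemplar message should influence the generated message less.
print(exemplar_msg, round(weight, 2))
```

In a full system, the similarity score would modulate how strongly a generator attends to the exemplar message, so that a poorly matching retrieved commit does not dominate the output.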
Taxonomy
Topics: Software Engineering Research · Topic Modeling · Web Data Mining and Analysis
Methods: Tanh Activation · Sigmoid Activation · Long Short-Term Memory · Sequence to Sequence
