Commit Messages in the Age of Large Language Models
Cristina V. Lopes, Vanessa I. Klotzman, Iris Ma, Iftekar Ahmed

TL;DR
This paper evaluates ChatGPT's ability to generate high-quality commit messages from code changes, shows that it significantly outperforms previous methods, and offers insights into its failure cases.
Contribution
It is the first comprehensive assessment of large language models like ChatGPT for automatic commit message generation, showing superior performance over prior approaches.
Findings
ChatGPT outperforms previous automatic commit message generation (ACMG) methods by orders of magnitude.
Generated messages are generally accurate and high-quality.
Insights and categorization of failure cases are provided.
Abstract
Commit messages are explanations of changes made to a codebase that are stored in version control systems. They help developers understand the codebase as it evolves. However, writing commit messages can be tedious and inconsistent among developers. To address this issue, researchers have tried using different methods to automatically generate commit messages, including rule-based, retrieval-based, and learning-based approaches. Advances in large language models offer new possibilities for generating commit messages. In this study, we evaluate the performance of OpenAI's ChatGPT for generating commit messages based on code changes. We compare the results obtained with ChatGPT to previous automatic commit message generation methods that have been trained specifically on commit data. Our goal is to assess the extent to which large pre-trained language models can generate commit messages…
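As a rough illustration of "generating commit messages based on code changes", the sketch below turns a before/after pair of file contents into a unified diff and wraps it in a text prompt. This is not the authors' prompt or pipeline: the function name `build_commit_prompt`, the instruction wording, and the example file path are all assumptions made for illustration, and sending the prompt to a model is deliberately left out.

```python
import difflib


def build_commit_prompt(before: str, after: str, path: str = "example.py") -> str:
    """Build a plain-text prompt asking a model to describe a code change.

    Hypothetical helper: the instruction wording is an assumption,
    not the prompt used in the paper.
    """
    # difflib.unified_diff yields the diff header and hunk lines one by one.
    diff_lines = difflib.unified_diff(
        before.splitlines(keepends=True),
        after.splitlines(keepends=True),
        fromfile=f"a/{path}",
        tofile=f"b/{path}",
    )
    diff_text = "".join(diff_lines)
    return (
        "Write a concise commit message for the following change.\n\n"
        + diff_text
    )


# Toy change: replace a hard-coded constant with math.pi.
before = "def area(r):\n    return 3.14 * r * r\n"
after = "import math\n\ndef area(r):\n    return math.pi * r * r\n"

prompt = build_commit_prompt(before, after)
print(prompt)
```

The resulting string could be passed to any chat-style model. A learning-based ACMG baseline would instead consume the diff directly as model input after training on commit data.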
Taxonomy
Topics: Natural Language Processing Techniques · Topic Modeling
