What Makes a Good Commit Message?
Yingchen Tian, Yuxia Zhang, Klaas-Jan Stol, Lin Jiang, and Hui Liu

TL;DR
This study defines criteria for good commit messages, analyzes their quality in open source projects, and explores automatic identification methods to improve developer communication and project documentation.
Contribution
It introduces a taxonomy of commit message patterns and assesses the prevalence of poorly written messages, highlighting data quality issues in automated message generation.
Findings
44% of commit messages could be improved
Uncurated datasets pose a threat to message generation quality
Little guidance exists for writing effective commit messages
Abstract
A key issue in collaborative software development is communication among developers. One modality of communication is a commit message, in which developers describe the changes they make in a repository. As such, commit messages serve as an "audit trail" by which developers can understand how the source code of a project has changed, and why. Hence, the quality of commit messages affects the effectiveness of communication among developers. Commit messages are often of poor quality as developers lack time and motivation to craft a good message. Several automatic approaches have been proposed to generate commit messages. However, these are based on uncurated datasets including considerable proportions of poorly phrased commit messages. In this multi-method study, we first define what constitutes a "good" commit message, and then establish what proportion of commit messages lack information…
