CommitSuite: A Comprehensive Benchmark for Commit Classification and Message Generation

Zirui Wan; Zhaonan Wu; Xinyi Hou; Yanjie Zhao; Pengcheng Xia; Haoyu Wang

arXiv:2605.02256·cs.SE·May 5, 2026

TL;DR

CommitSuite introduces a large-scale, CCS-compliant commit benchmark with semantic annotations and a novel reference-free evaluation framework, advancing research in commit classification and message generation.

Contribution

CommitSuite provides the first large-scale benchmark dataset with semantic annotations, together with a new reference-free evaluation method, for commit classification and commit message generation.

Findings

01

LLMs support commit message generation effectively.

02

The evaluation framework achieves a Cohen's Kappa agreement of 0.849.

03

CommitSuite enables reproducible research in commit understanding.
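
For context on finding 02: Cohen's Kappa measures agreement between two raters after correcting for agreement expected by chance, kappa = (p_o - p_e) / (1 - p_e). The following is a minimal sketch of that statistic, not the paper's evaluation code; the function name `cohens_kappa` and the sample labels are illustrative assumptions.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Cohen's Kappa for two equal-length sequences of categorical labels."""
    if len(rater_a) != len(rater_b) or len(rater_a) == 0:
        raise ValueError("label sequences must be non-empty and of equal length")

    n = len(rater_a)

    # Observed agreement: fraction of items on which both raters agree.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n

    # Chance agreement: for each label, the product of the two raters'
    # marginal frequencies, summed over all labels.
    count_a = Counter(rater_a)
    count_b = Counter(rater_b)
    labels = set(count_a) | set(count_b)
    expected = sum(count_a[label] * count_b[label] for label in labels) / (n * n)

    # Both raters used a single identical label: agreement is trivially perfect.
    if expected == 1.0:
        return 1.0

    return (observed - expected) / (1.0 - expected)


# Hypothetical binary judgments (1 = pass, 0 = fail) from two raters.
judge_labels = [1, 1, 0, 0]
human_labels = [1, 1, 0, 1]
kappa = cohens_kappa(judge_labels, human_labels)
```

With these sample labels the observed agreement is 0.75 and the chance agreement is 0.5, so the statistic comes out to 0.5. Values above 0.8, such as the reported 0.849, are conventionally read as strong agreement.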

Abstract

High-quality commit messages are critical for maintaining software projects, yet ensuring their consistency and informativeness remains a practical challenge. While the Conventional Commits Specification (CCS) provides a structured format for commit messages, research on CCS-based commit classification and commit message generation (CMG) is limited by the absence of large-scale benchmarks, semantic annotations, and reliable evaluation methods. In this paper, we introduce CommitSuite, a benchmark comprising 63,533 CCS-compliant commits from 243 open-source repositories across seven programming languages. Each commit is labeled with its CCS type and enriched with AST-level code changes, along with LLM-assisted semantic annotations that capture the "what" and "why" behind the change. To evaluate CMG systems, we propose a reference-free framework based on five binary metrics: rationality,…
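
To illustrate what "CCS-compliant" means in practice, the sketch below parses the header line of a commit message in the Conventional Commits form `<type>[(scope)][!]: <description>` and extracts its type. It is an illustration of the format only, not the paper's collection or classification pipeline; the names `CCS_HEADER`, `CommitHeader`, and `parse_ccs_header` are assumptions introduced here.

```python
import re
from dataclasses import dataclass
from typing import Optional

# Header shape from the Conventional Commits format:
#   <type>[(scope)][!]: <description>
CCS_HEADER = re.compile(
    r"^(?P<type>[A-Za-z]+)"            # commit type, e.g. feat, fix, docs
    r"(?:\((?P<scope>[^()\r\n]+)\))?"  # optional scope in parentheses
    r"(?P<breaking>!)?"                # optional breaking-change marker
    r": (?P<description>\S.*)$"        # colon, space, non-empty description
)


@dataclass
class CommitHeader:
    type: str
    scope: Optional[str]
    breaking: bool
    description: str


def parse_ccs_header(message: str) -> Optional[CommitHeader]:
    """Return the parsed header, or None if the first line is not CCS-shaped."""
    lines = message.splitlines()
    first_line = lines[0] if lines else ""
    match = CCS_HEADER.match(first_line)
    if match is None:
        return None
    return CommitHeader(
        type=match["type"].lower(),
        scope=match["scope"],
        breaking=match["breaking"] is not None,
        description=match["description"],
    )


# Hypothetical commit messages for illustration.
parsed = parse_ccs_header("feat(parser)!: add array support\n\nLonger body text.")
rejected = parse_ccs_header("Update README")
```

Here `parsed.type` is `"feat"` with scope `"parser"` and the breaking flag set, while `rejected` is `None` because the message has no `type: description` header. A header check like this only establishes format compliance; whether the declared type matches the actual code change is the classification problem the benchmark targets.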
