Automated Classification of Human Code Review Comments with Large Language Models

Semih Çağlar; Şükrü Eren Gökırmak; Eray Tüzün

arXiv:2604.23667·cs.SE·April 28, 2026

TL;DR

This paper develops and evaluates an automated system that uses large language models to classify code review comments by specific categories of issues, such as redundancy and vagueness.

Contribution

The paper introduces a nine-label taxonomy for classifying review comments, covering six review comment smells and three common useful intents, and benchmarks zero-shot and one-shot LLM performance on this task.

Findings

01 Zero-shot macro-F1 scores ranged from 0.360 to 0.374.

02 One-shot exemplars improved performance for some models and labels.

03 Classification of evidence-sensitive labels remains challenging.
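
For readers unfamiliar with the metric in the findings above: macro-F1 is the unweighted mean of per-label F1 scores, so rare labels count as much as frequent ones. The following is a minimal sketch of that computation for single-label classification. It is not the authors' evaluation code; the function name `macro_f1` and the placeholder labels in the usage note are assumptions made for illustration only.

```python
from collections import Counter


def macro_f1(gold, pred):
    """Unweighted mean of per-label F1 over every label seen in gold or pred.

    Hypothetical helper for illustration; not taken from the paper.
    """
    if len(gold) != len(pred):
        raise ValueError("gold and pred must have the same length")

    labels = sorted(set(gold) | set(pred))
    if not labels:
        return 0.0

    # Count true positives, false positives, and false negatives per label.
    tp, fp, fn = Counter(), Counter(), Counter()
    for g, p in zip(gold, pred):
        if g == p:
            tp[g] += 1
        else:
            fp[p] += 1  # predicted label p, but it was wrong
            fn[g] += 1  # gold label g was missed

    # Per-label F1 = 2*TP / (2*TP + FP + FN); the denominator is nonzero
    # because every label here occurs at least once in gold or pred.
    scores = [
        2 * tp[label] / (2 * tp[label] + fp[label] + fn[label])
        for label in labels
    ]
    return sum(scores) / len(scores)
```

As a usage example with placeholder labels, `macro_f1(["a", "a", "b", "c"], ["a", "b", "b", "c"])` averages per-label F1 scores of 2/3, 2/3, and 1. Because every label is weighted equally, a model that does well on common labels but poorly on rare ones can still have a low macro-F1.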

Abstract

Context: Code reviews are essential for maintaining software quality, yet many human review comments suffer from issues such as redundancy, vagueness, or lack of constructiveness. These types of comments may slow down feedback and obscure important insights. Prior work on code review comments mostly explores the detection and categorization of useful comments, while fine-grained categorization of comment issues remains underexplored. Objective: This work aims to design and evaluate an automated system for classifying code review comments according to specific categories of issues. Methodology: We introduced a nine-label taxonomy for code review comments, covering six review comment smells and three common useful intents, and manually labeled 448 comments from a publicly available dataset. We benchmarked zero-shot and one-shot single-label classification over each comment and its…
