Enhancing LLM-Based Data Annotation with Error Decomposition

Zhen Xu; Vedant Khatri; Yijun Dai; Xiner Liu; Siyan Li; Xuanming Zhang; Renzhe Yu

arXiv:2601.11920·cs.CL·January 21, 2026

Open Access

TL;DR

This paper introduces a diagnostic evaluation framework for LLM-based data annotation that distinguishes between different error sources and assesses their impact on downstream tasks, especially in subjective annotation contexts.

Contribution

It proposes a taxonomy and a lightweight human-in-the-loop method to decompose and diagnose LLM annotation errors, improving understanding of annotation quality and task suitability.

Findings

01 Validated on four educational annotation tasks.
02 Demonstrated the paradigm's ability to distinguish error types.
03 Provided insights into the limitations of high alignment scores.

Abstract

Large language models offer a scalable alternative to human coding for data annotation tasks, enabling the scale-up of research across data-intensive domains. While LLMs are already achieving near-human accuracy on objective annotation tasks, their performance on subjective annotation tasks, such as those involving psychological constructs, is less consistent and more prone to errors. Standard evaluation practices typically collapse all annotation errors into a single alignment metric, but this simplified approach may obscure different kinds of errors that affect final analytical conclusions in different ways. Here, we propose a diagnostic evaluation paradigm that incorporates a human-in-the-loop step to separate task-inherent ambiguity from model-driven inaccuracies and assess annotation quality in terms of its potential downstream impact. We refine this paradigm on ordinal…
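To make the contrast between a single alignment metric and a decomposed view concrete, here is a minimal sketch in Python. It is not the paper's method: the function name `decompose_errors`, the boolean `ambiguous` flag (standing in for a human reviewer's judgment on each item), and the toy labels are all assumptions introduced for illustration.

```python
from collections import Counter


def decompose_errors(human, llm, ambiguous):
    """Split LLM-vs-human disagreements using a reviewer's ambiguity flag.

    human, llm : sequences of labels for the same items.
    ambiguous  : sequence of bools; True means a human reviewer judged the
                 item inherently ambiguous (a hypothetical human-in-the-loop
                 signal, not the paper's actual protocol).
    """
    if not (len(human) == len(llm) == len(ambiguous)):
        raise ValueError("inputs must have the same length")
    if len(human) == 0:
        raise ValueError("inputs must be non-empty")

    counts = Counter()
    for h, m, amb in zip(human, llm, ambiguous):
        if h == m:
            counts["agree"] += 1
        elif amb:
            # Disagreement on an item the reviewer flagged as ambiguous.
            counts["task_ambiguity"] += 1
        else:
            # Disagreement on an item the reviewer considered clear-cut.
            counts["model_error"] += 1

    n = len(human)
    return {
        "alignment": counts["agree"] / n,
        "task_ambiguity_rate": counts["task_ambiguity"] / n,
        "model_error_rate": counts["model_error"] / n,
    }


# Toy data (invented for illustration only).
human = [1, 2, 3, 2, 1, 3, 2, 1]
llm = [1, 2, 2, 2, 3, 3, 1, 1]
ambiguous = [False, False, True, False, False, False, True, False]

report = decompose_errors(human, llm, ambiguous)
print(report)
```

On this toy data the single alignment score is 0.625, while the decomposition shows that two of the three disagreements fall on items flagged as ambiguous and only one is attributed to the model. The same alignment score could hide a very different split, which is the kind of information a collapsed metric loses.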

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Text Readability and Simplification