DiPietro-Hazari Kappa: A Novel Metric for Assessing Labeling Quality via Annotation
Daniel M. DiPietro, Vivek Hazari

TL;DR
This paper introduces DiPietro-Hazari Kappa, a statistical metric built on Fleiss's Kappa for evaluating the quality of suggested dataset labels in human annotation tasks. It pairs a theoretical treatment of Fleiss's Kappa with a matrix formulation for computing the new metric.
Contribution
The paper presents a novel metric, DiPietro-Hazari Kappa, that extends Fleiss's Kappa to better assess labeling quality in datasets, including theoretical foundations and implementation guidance.
Findings
The metric quantifies annotator agreement above random chance.
Theoretical analysis of Fleiss's Kappa informs the new metric.
The paper provides a matrix formulation and procedural instructions for computation.
Abstract
Data is a key component of modern machine learning, but statistics for assessing data label quality remain sparse in the literature. Here, we introduce DiPietro-Hazari Kappa, a novel statistical metric for assessing the quality of suggested dataset labels in the context of human annotation. Rooted in the classical Fleiss's Kappa measure of inter-annotator agreement, the DiPietro-Hazari Kappa quantifies the empirical annotator agreement differential that was attained above random chance. We offer a thorough theoretical examination of Fleiss's Kappa before turning to our derivation of DiPietro-Hazari Kappa. Finally, we conclude with a matrix formulation and set of procedural instructions for easy computational implementation.
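Because the abstract builds on the classical Fleiss's Kappa, a minimal sketch of that baseline measure may help fix ideas. This is the standard Fleiss's Kappa only, not DiPietro-Hazari Kappa, whose derivation is not reproduced in this summary. The function name `fleiss_kappa` and the input layout (one row per item, one column per category, each cell counting the annotators who chose that category) are assumptions made for illustration.

```python
from typing import Sequence


def fleiss_kappa(counts: Sequence[Sequence[int]]) -> float:
    """Classical Fleiss's Kappa for a table of rating counts.

    counts[i][j] is the number of annotators who assigned item i to
    category j. Every item must be rated by the same number of annotators.
    (Hypothetical helper; the input layout is an assumption.)
    """
    if not counts:
        raise ValueError("counts must contain at least one item")

    n_items = len(counts)
    n_raters = sum(counts[0])
    if n_raters < 2:
        raise ValueError("at least two annotators per item are required")
    if any(sum(row) != n_raters for row in counts):
        raise ValueError("every item must have the same number of ratings")

    n_categories = len(counts[0])

    # p_j: share of all ratings that fell into category j.
    total = n_items * n_raters
    p = [sum(row[j] for row in counts) / total for j in range(n_categories)]

    # P_i: fraction of annotator pairs that agree on item i.
    pairs = n_raters * (n_raters - 1)
    per_item = [(sum(c * c for c in row) - n_raters) / pairs for row in counts]

    observed = sum(per_item) / n_items   # mean observed agreement
    expected = sum(q * q for q in p)     # agreement expected by chance

    if expected == 1.0:
        raise ValueError("kappa is undefined when all ratings share one category")

    # Agreement attained above chance, normalised by the maximum attainable.
    return (observed - expected) / (1.0 - expected)


if __name__ == "__main__":
    # Two items, three annotators, two categories, unanimous on each item.
    print(fleiss_kappa([[3, 0], [0, 3]]))
```

With unanimous ratings on every item, as in the usage example, observed agreement is 1 and the function returns 1.0. Values near 0 indicate agreement no better than chance, and negative values indicate agreement below chance.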
Taxonomy
Topics: Multi-Criteria Decision Making · Reliability and Agreement in Measurement · Sensory Analysis and Statistical Methods
