Inter-Coder Agreement for Nominal Scales: A Model-based Approach
Dirk Schuster

TL;DR
This paper introduces a model-based approach to inter-coder agreement for nominal scales, aiming to clarify reliability measures and address issues like bias and prevalence effects through a formal axiomatic framework.
Contribution
It proposes a formal model for inter-coder reliability, defines a new reliability measure as a probability, and provides an algorithm with simulations to evaluate its accuracy.
Findings
The model clarifies the conditions under which reliability can be uniquely determined.
The algorithm accurately estimates reliability under various settings.
Simulations demonstrate the effectiveness of the proposed approach.
Abstract
Inter-coder agreement measures, like Cohen's kappa, correct the relative frequency of agreement between coders to account for agreement that occurs simply by chance. However, in some situations these measures exhibit behavior that makes their values difficult to interpret. These properties, e.g. the "annotator bias" or the "problem of prevalence", refer to a tendency of some of these measures to indicate counterintuitively high or low values of reliability depending on conditions which many researchers consider unrelated to inter-coder reliability. However, not all researchers agree with this view, and since there is no commonly accepted formal definition of inter-coder reliability, it is hard to decide whether this depends upon a different concept of reliability or simply upon flaws in the measuring algorithms. In this note we therefore take an axiomatic approach: we introduce a…
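To make the chance correction and the "problem of prevalence" concrete, here is a minimal sketch of the standard two-coder Cohen's kappa, kappa = (p_o - p_e) / (1 - p_e), where p_o is the observed agreement and p_e is the agreement expected by chance from each coder's label frequencies. This is the textbook measure mentioned in the abstract, not the model-based measure the paper proposes. The function names `cohens_kappa` and `expand` and the two toy data sets are assumptions made up for illustration.

```python
from collections import Counter


def cohens_kappa(labels_a, labels_b):
    """Standard Cohen's kappa for two coders labelling the same items."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("need two non-empty label lists of equal length")
    n = len(labels_a)
    # Observed agreement: share of items on which both coders agree.
    p_o = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Chance agreement: product of the coders' marginal label frequencies.
    count_a, count_b = Counter(labels_a), Counter(labels_b)
    p_e = sum(count_a[c] * count_b[c] for c in count_a) / (n * n)
    if p_e == 1.0:
        raise ValueError("kappa is undefined when chance agreement is 1")
    return (p_o - p_e) / (1 - p_e)


def expand(pairs):
    """Turn {(label_a, label_b): count} into two parallel label lists."""
    a, b = [], []
    for (label_a, label_b), count in pairs.items():
        a += [label_a] * count
        b += [label_b] * count
    return a, b


# Hypothetical data: both sets have 90 agreements out of 100 items.
balanced_a, balanced_b = expand(
    {("yes", "yes"): 45, ("no", "no"): 45, ("yes", "no"): 5, ("no", "yes"): 5}
)
skewed_a, skewed_b = expand(
    {("yes", "yes"): 90, ("yes", "no"): 5, ("no", "yes"): 5}
)

print(cohens_kappa(balanced_a, balanced_b))
print(cohens_kappa(skewed_a, skewed_b))
```

Both toy data sets have the same observed agreement of 0.9, yet kappa is about 0.80 for the balanced labels and about -0.05 for the skewed ones, because the chance term p_e rises from 0.5 to 0.905 when one category dominates. This is the kind of prevalence-dependent behavior the abstract describes as difficult to interpret.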
Taxonomy
Topics: Reliability and Agreement in Measurement · Meta-analysis and systematic reviews · Hemodynamic Monitoring and Therapy
