Structured Disagreement in Health-Literacy Annotation: Epistemic Stability, Conceptual Difficulty, and Agreement-Stratified Inference
Olga Kellert, Sriya Kondury, Candice Koo, Nemika Tyagi, Steffen Eikenberry

TL;DR
This study analyzes graded health-literacy annotations of 6,323 open-ended COVID-19 responses collected in Ecuador and Peru. It shows that annotator disagreement is structured by question difficulty and varies with social factors, which challenges standard label aggregation.
Contribution
It demonstrates that disagreement in health-literacy annotation is structured by question difficulty and social factors, advocating for perspectivist models in interpretive NLP tasks.
Findings
Question-level difficulty explains more variance than annotator identity.
Social factors like country and education influence agreement levels.
Aggregating annotations can obscure important interpretive differences.
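The first finding can be illustrated with a minimal sketch of a variance decomposition. This is not the authors' procedure, which the summary does not specify. It assumes a balanced, fully crossed design in which every annotator scores every question, and it uses made-up toy scores. The function name `variance_shares` and the data are hypothetical.

```python
from statistics import mean


def variance_shares(scores):
    """Split total variance into question, annotator, and residual shares.

    scores[q][a] is the proportional correctness score that annotator a
    gave on question q. Assumes a balanced, fully crossed design with
    some variation in the scores (illustrative assumption, not taken
    from the paper).
    """
    n_q = len(scores)
    n_a = len(scores[0])
    grand = mean(x for row in scores for x in row)

    # Mean score per question (rows) and per annotator (columns).
    q_means = [mean(row) for row in scores]
    a_means = [mean(scores[q][a] for q in range(n_q)) for a in range(n_a)]

    # Sums of squares for a two-way layout without replication.
    ss_total = sum((x - grand) ** 2 for row in scores for x in row)
    ss_q = n_a * sum((m - grand) ** 2 for m in q_means)
    ss_a = n_q * sum((m - grand) ** 2 for m in a_means)
    ss_res = ss_total - ss_q - ss_a

    return {
        "question": ss_q / ss_total,
        "annotator": ss_a / ss_total,
        "residual": ss_res / ss_total,
    }


# Toy data: 3 questions (rows) scored by 3 annotators (columns).
toy_scores = [
    [1.0, 0.9, 1.0],  # easy question: high scores from everyone
    [0.5, 0.4, 0.6],  # medium question
    [0.1, 0.0, 0.2],  # hard question: low scores from everyone
]
shares = variance_shares(toy_scores)
print(shares)
```

In this toy setup the question share dominates because the scores differ far more between rows than between columns. A real analysis with unbalanced data would more likely use a mixed-effects model with random effects for question and annotator.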
Abstract
Annotation pipelines in Natural Language Processing (NLP) commonly assume a single latent ground truth per instance and resolve disagreement through label aggregation. Perspectivist approaches challenge this view by treating disagreement as potentially informative rather than erroneous. We present a large-scale analysis of graded health-literacy annotations from 6,323 open-ended COVID-19 responses collected in Ecuador and Peru. Each response was independently labeled by multiple annotators using proportional correctness scores, which reflect the degree to which a response aligns with normative public-health guidelines. This allows us to analyze the full distribution of judgments rather than aggregated labels. Variance decomposition shows that question-level conceptual difficulty accounts for substantially more variance than annotator identity, indicating that disagreement is structured by the…
