NUTMEG: Separating Signal From Noise in Annotator Disagreement
Jonathan Ivey, Susan Gauch, and David Jurgens

TL;DR
NUTMEG is a Bayesian model that effectively distinguishes between genuine systematic disagreements and noisy annotations in crowdsourced NLP data, improving ground-truth recovery and downstream model performance.
Contribution
The paper introduces NUTMEG, a new Bayesian model that uses information about annotator backgrounds to separate signal from noise in annotator disagreement.
Findings
On synthetic data, NUTMEG outperforms traditional aggregation methods at recovering ground truth.
Models trained on NUTMEG-processed data perform better.
Systematic disagreements can be preserved while noisy annotations are removed.
Abstract
NLP models often rely on human-labeled data for training and evaluation. Many approaches crowdsource this data from a large number of annotators with varying skills, backgrounds, and motivations, resulting in conflicting annotations. These conflicts have traditionally been resolved by aggregation methods that assume disagreements are errors. Recent work has argued that for many tasks annotators may have genuine disagreements and that variation should be treated as signal rather than noise. However, few models separate signal and noise in annotator disagreement. In this work, we introduce NUTMEG, a new Bayesian model that incorporates information about annotator backgrounds to remove noisy annotations from human-labeled training data while preserving systematic disagreements. Using synthetic data, we show that NUTMEG is more effective at recovering ground-truth from annotations with…
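To make the distinction in the abstract concrete, here is a minimal sketch contrasting pooled majority voting (which treats every disagreement as error) with majority voting within each annotator group (which keeps a disagreement that is consistent within a subpopulation). This is not NUTMEG itself, which is a Bayesian model; the function names, group labels, and toy annotations are hypothetical and chosen only for illustration.

```python
from collections import Counter, defaultdict


def majority_vote(labels):
    """Return the most common label (ties go to the label seen first)."""
    return Counter(labels).most_common(1)[0][0]


def pooled_majority(annotations):
    """Aggregate per item, ignoring annotator background.

    annotations: iterable of (item, group, label) tuples (hypothetical schema).
    """
    by_item = defaultdict(list)
    for item, _group, label in annotations:
        by_item[item].append(label)
    return {item: majority_vote(labels) for item, labels in by_item.items()}


def per_group_majority(annotations):
    """Aggregate per (item, group), so group-level disagreement survives."""
    by_key = defaultdict(list)
    for item, group, label in annotations:
        by_key[(item, group)].append(label)
    return {key: majority_vote(labels) for key, labels in by_key.items()}


# Toy data: group A mostly says "offensive", group B consistently says "ok".
annotations = [
    ("post1", "A", "offensive"),
    ("post1", "A", "offensive"),
    ("post1", "A", "ok"),  # a lone dissent inside group A, treated as noise
    ("post1", "B", "ok"),
    ("post1", "B", "ok"),
    ("post1", "B", "ok"),
    ("post1", "B", "ok"),
]

pooled = pooled_majority(annotations)
grouped = per_group_majority(annotations)
print(pooled)
print(grouped)
```

In this toy case the pooled vote returns a single label for `post1` and erases group A's view, while the per-group vote keeps one label per group and drops only the lone dissent inside group A. A probabilistic model would additionally estimate annotator reliability rather than relying on raw counts.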
Taxonomy
Topics: Mobile Crowdsensing and Crowdsourcing · Ethics and Social Impacts of AI · Topic Modeling
