Designing NLP Systems That Adapt to Diverse Worldviews
Claudiu Creanga, Liviu P. Dinu

TL;DR
This paper advocates for incorporating diverse worldviews into NLP datasets to better capture subjective meaning, demonstrating initial improvements in model performance through annotator metadata.
Contribution
It introduces a perspectivist approach to NLP dataset design, explicitly modeling annotator demographics and values to address subjectivity in language understanding.
Findings
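Even limited annotator metadata can improve model performance.
Datasets can be designed to capture diverse worldviews by recording annotator demographics, values, and label justifications.
Initial experiments on a subset of the SBIC dataset show promising results.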
Abstract
Natural Language Inference (NLI) is foundational for evaluating language understanding in AI. However, progress has plateaued, with models failing on ambiguous examples and exhibiting poor generalization. We argue that this stems from disregarding the subjective nature of meaning, which is intrinsically tied to an individual's weltanschauung (which roughly translates to "worldview"). Existing NLP datasets often obscure this by aggregating labels or filtering out disagreement. We propose a perspectivist approach: building datasets that capture annotator demographics, values, and justifications for their labels. Such datasets would explicitly model diverse worldviews. Our initial experiments with a subset of the SBIC dataset demonstrate that even limited annotator metadata can improve model performance.
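To make the contrast between aggregated and perspectivist labels concrete, here is a minimal Python sketch. It is not the authors' implementation and does not follow the SBIC schema: the class names, field names, and the metadata-prefix input format are all hypothetical, chosen only to illustrate keeping each annotator's label, metadata, and justification instead of collapsing them into a single majority vote.

```python
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class Annotator:
    # Hypothetical fields; a real dataset would define its own schema.
    annotator_id: str
    demographics: dict = field(default_factory=dict)
    values: dict = field(default_factory=dict)


@dataclass
class Annotation:
    item_id: str
    text: str
    annotator: Annotator
    label: str
    justification: str = ""


def majority_label(annotations):
    """Aggregated view: one label per item, disagreement is discarded."""
    counts = Counter(a.label for a in annotations)
    return counts.most_common(1)[0][0]


def label_distribution(annotations):
    """Perspectivist view: keep the full spread of labels for an item."""
    counts = Counter(a.label for a in annotations)
    total = sum(counts.values())
    return {label: n / total for label, n in counts.items()}


def to_model_input(annotation):
    """One assumed way to expose metadata to a model: prefix it to the text."""
    demo = annotation.annotator.demographics
    meta = " ".join(f"{key}={demo[key]}" for key in sorted(demo))
    return f"[ANNOTATOR {meta}] {annotation.text}"
```

Under this layout, a model can be trained on one example per (item, annotator) pair, so the same text may legitimately carry different labels depending on who annotated it. Prefixing metadata to the input is only one option; learned annotator embeddings would be another.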
Taxonomy
Topics: Natural Language Processing Techniques · Advanced Text Analysis Techniques · Topic Modeling
