Understanding and Predicting Human Label Variation in Natural Language Inference through Explanation
Nan-Jiang Jiang, Chenhao Tan, Marie-Catherine de Marneffe

TL;DR
This paper introduces LiveNLI, a dataset of annotator highlights and free-text explanations for natural language inference (NLI) labels, built to help models identify and explain human label variation (annotation disagreement).
Contribution
The creation of LiveNLI, the first ecologically valid explanation dataset with diverse reasoning for NLI label variation, and its use in chain-of-thought prompting to evaluate how well GPT-3 predicts label distributions.
Findings
Even with chain-of-thought prompting built from LiveNLI explanations, GPT-3's in-context learning leaves room for improvement in predicting label distributions.
LiveNLI provides a resource for understanding human annotation disagreement.
Explanations can enhance model robustness and interpretability.
Abstract
Human label variation (Plank 2022), or annotation disagreement, exists in many natural language processing (NLP) tasks. To be robust and trusted, NLP models need to identify such variation and be able to explain it. To this end, we created the first ecologically valid explanation dataset with diverse reasoning, LiveNLI. LiveNLI contains annotators' highlights and free-text explanations for the label(s) of their choice for 122 English Natural Language Inference items, each with at least 10 annotations. We used its explanations for chain-of-thought prompting, and found there is still room for improvement in GPT-3's ability to predict label distribution with in-context learning.
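To make "predicting label distribution" concrete, the sketch below builds a per-item human label distribution from annotator choices and compares it with a model's predicted distribution. It is an illustration only: the three-way label set, the toy annotations, the predicted probabilities, and the choice of total variation distance and KL divergence are all assumptions made here, not the metrics or data reported in the paper.

```python
from collections import Counter
import math

# Assumed three-way NLI label set (not specified in the abstract).
LABELS = ("entailment", "neutral", "contradiction")


def label_distribution(annotations):
    """Normalize annotator choices into a distribution over LABELS.

    `annotations` is a list with one entry per annotator; each entry is the
    list of labels that annotator chose (annotators may pick more than one).
    """
    counts = Counter(label for chosen in annotations for label in chosen)
    total = sum(counts[label] for label in LABELS)
    if total == 0:
        raise ValueError("no valid labels in annotations")
    return [counts[label] / total for label in LABELS]


def total_variation(p, q):
    """Total variation distance between two distributions (0 = identical)."""
    return 0.5 * sum(abs(a - b) for a, b in zip(p, q))


def kl_divergence(p, q, eps=1e-9):
    """KL(p || q), clamping q at eps to avoid division by zero."""
    return sum(a * math.log(a / max(b, eps)) for a, b in zip(p, q) if a > 0)


# Hypothetical item with 10 annotators: 6 chose entailment, 4 chose neutral.
human = label_distribution([["entailment"]] * 6 + [["neutral"]] * 4)

# Hypothetical model output for the same item.
predicted = [0.8, 0.1, 0.1]

tv = total_variation(human, predicted)
kl = kl_divergence(human, predicted)
print(f"human={human} TV={tv:.3f} KL={kl:.3f}")
```

A lower distance means the predicted distribution tracks the spread of human judgments more closely; a model that always puts all its mass on the majority label would score poorly on items where annotators genuinely disagree.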
Taxonomy
Topics: Natural Language Processing Techniques · Topic Modeling
