Capturing Label Distribution: A Case Study in NLI

Shujian Zhang; Chengyue Gong; Eunsol Choi

arXiv:2102.06859 · cs.CL · February 16, 2021 · 5 cites

Open Access

TL;DR

This paper investigates methods to better estimate human disagreement in natural language inference, showing that post-hoc smoothing and collecting multiple references improve label distribution modeling.

Contribution

It adds a small number of training examples annotated with multiple references, departing from the usual single reference per example, and compares this approach with post-hoc smoothing for estimating label distributions in NLI.

Findings

01

Post-hoc smoothing reduces KL divergence by nearly half.

02

Collecting multiple references for some training examples improves accuracy under a fixed annotation budget.

03

Simple smoothing does not enhance majority label prediction accuracy.
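
To make the first and third findings concrete, here is a minimal sketch of how smoothing can lower KL divergence against a human label distribution while leaving the majority label untouched. The smoothing used here, interpolation with the uniform distribution, is an assumption chosen for simplicity and is not claimed to be the paper's method; the function names, the value of `alpha`, and the example distributions are all hypothetical.

```python
import math

def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) in nats for two discrete distributions over the same labels."""
    return sum(pi * math.log(pi / max(qi, eps)) for pi, qi in zip(p, q) if pi > 0)

def uniform_smooth(probs, alpha):
    """Mix a distribution with the uniform distribution (hypothetical helper).

    alpha = 0 returns the input unchanged; alpha = 1 returns the uniform
    distribution. Adding the same constant to every entry keeps the argmax.
    """
    k = len(probs)
    return [(1.0 - alpha) * p + alpha / k for p in probs]

# Hypothetical example over (entailment, neutral, contradiction).
human = [0.6, 0.3, 0.1]      # made-up annotator label distribution
model = [0.97, 0.02, 0.01]   # made-up over-confident model prediction
smoothed = uniform_smooth(model, alpha=0.4)

kl_before = kl_divergence(human, model)
kl_after = kl_divergence(human, smoothed)

# The predicted majority label is the same before and after smoothing.
same_majority = model.index(max(model)) == smoothed.index(max(smoothed))
```

Because this kind of smoothing only flattens the prediction, it can bring the distribution closer to the human one without changing which label the model ranks first, which is consistent with the unchanged majority-label accuracy reported above.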

Abstract

We study estimating inherent human disagreement (the annotation label distribution) in the natural language inference task. Post-hoc smoothing of the predicted label distribution to match the expected label entropy is very effective. Such a simple manipulation can reduce KL divergence by almost half, yet it does not improve majority label prediction accuracy, nor does it learn label distributions. To this end, we introduce a small number of examples with multiple references into training. We depart from the standard practice of collecting a single reference for each training example, and find that collecting multiple references can achieve better accuracy under a fixed annotation budget. Lastly, we provide rich analyses comparing these two methods for improving label distribution estimation.
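
The abstract's "smoothing to match the expected label entropy" can be sketched as follows. This sketch assumes temperature scaling as the smoothing operation and a single temperature shared across all predictions, found by bisection so that the mean prediction entropy equals a target value. Both choices are assumptions for illustration, not a description of the authors' procedure; `fit_temperature`, the search bounds, the example predictions, and the target entropy are hypothetical.

```python
import math

def entropy(probs):
    """Shannon entropy in nats."""
    return -sum(p * math.log(p) for p in probs if p > 0)

def temperature_smooth(probs, temperature):
    """Raise each probability to the power 1/temperature and renormalize.

    temperature > 1 flattens the distribution; temperature < 1 sharpens it.
    """
    powered = [p ** (1.0 / temperature) for p in probs]
    total = sum(powered)
    return [p / total for p in powered]

def mean_entropy(dists):
    return sum(entropy(d) for d in dists) / len(dists)

def fit_temperature(preds, target_mean_entropy, lo=0.05, hi=50.0, iters=60):
    """Bisection search for one shared temperature (hypothetical helper).

    Relies on entropy being non-decreasing in temperature, and on the
    target lying between the entropies reached at `lo` and `hi`.
    """
    for _ in range(iters):
        mid = (lo + hi) / 2.0
        smoothed = [temperature_smooth(p, mid) for p in preds]
        if mean_entropy(smoothed) < target_mean_entropy:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0

# Made-up model predictions over (entailment, neutral, contradiction).
preds = [[0.97, 0.02, 0.01], [0.05, 0.90, 0.05], [0.10, 0.15, 0.75]]
target = 0.8  # hypothetical mean entropy of human label distributions, in nats

t = fit_temperature(preds, target)
calibrated = [temperature_smooth(p, t) for p in preds]
```

In practice the target entropy would be estimated from examples that carry multiple human labels, and the fitted temperature would then be applied to held-out predictions. Note that this step needs no retraining, which is why it cannot by itself change the majority-label prediction.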

Taxonomy

Topics: Natural Language Processing Techniques · Topic Modeling · Speech and dialogue systems