Investigating Biases in Textual Entailment Datasets

Shawn Tan; Yikang Shen; Chin-wei Huang; Aaron Courville

arXiv:1906.09635·cs.CL·June 25, 2019·5 cites

TL;DR

This paper examines biases in textual entailment datasets such as SNLI and MultiNLI, analyzes their impact on model performance, and proposes a simple method to reduce them so that language understanding evaluation is more reliable.

Contribution

The paper analyzes the extent of dataset biases in SNLI and MultiNLI and introduces a simple method to reduce them, improving dataset quality.

Findings

01. Classifying hypotheses alone achieves 64% accuracy on SNLI.

02. Biases significantly influence model performance.

03. The proposed bias reduction method decreases dataset biases.

Abstract

Understanding logical relationships between sentences is an important task in language understanding. To aid progress on this task, researchers have collected datasets for training and evaluating current systems. However, as in the crowdsourced Visual Question Answering (VQA) task, some biases in the data inevitably occur. In our experiments, we find that performing classification on just the hypotheses of the SNLI dataset yields an accuracy of 64%. We analyze the extent of the bias in the SNLI and MultiNLI datasets, discuss its implications, and propose a simple method to reduce the biases in the datasets.
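
To make the hypothesis-only result concrete, the following is a minimal sketch of what such a baseline can look like. It is not the authors' model: the Naive Bayes classifier, the whitespace tokenizer, the function names, and the six toy examples are all assumptions introduced here for illustration. The point it shows is that the premise is never read, so any accuracy above chance (roughly one third if the three labels are balanced) must come from regularities in the hypotheses themselves.

```python
import math
from collections import Counter, defaultdict


def train_hypothesis_only(examples):
    """Fit a multinomial Naive Bayes model on hypothesis text alone.

    `examples` is a list of (premise, hypothesis, label) triples.
    The premise is deliberately ignored.
    """
    label_counts = Counter()
    word_counts = defaultdict(Counter)
    vocab = set()
    for _premise, hypothesis, label in examples:
        label_counts[label] += 1
        for tok in hypothesis.lower().split():
            word_counts[label][tok] += 1
            vocab.add(tok)
    return label_counts, word_counts, vocab


def predict(model, hypothesis):
    """Return the most probable label for a hypothesis (add-one smoothing)."""
    label_counts, word_counts, vocab = model
    total = sum(label_counts.values())
    best_label, best_score = None, -math.inf
    for label in sorted(label_counts):
        score = math.log(label_counts[label] / total)
        denom = sum(word_counts[label].values()) + len(vocab)
        for tok in hypothesis.lower().split():
            if tok in vocab:  # skip words never seen in training
                score += math.log((word_counts[label][tok] + 1) / denom)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


def accuracy(model, examples):
    """Fraction of examples whose label is predicted from the hypothesis only."""
    correct = sum(predict(model, h) == y for _p, h, y in examples)
    return correct / len(examples)


# Invented toy data, NOT taken from SNLI or MultiNLI.
TOY_TRAIN = [
    ("A man plays guitar on stage.", "Nobody is playing music.", "contradiction"),
    ("A dog runs through a field.", "The dog is not moving.", "contradiction"),
    ("A woman reads a book.", "A person is reading.", "entailment"),
    ("Two kids play soccer.", "Children are outside.", "entailment"),
    ("A man walks down a street.", "A man is walking to his tall friend.", "neutral"),
    ("A girl eats lunch.", "A girl is eating her favorite meal.", "neutral"),
]

model = train_hypothesis_only(TOY_TRAIN)
print(predict(model, "nobody not"))
```

On a real dataset, one would train on the training split and call `accuracy` on a held-out split; a score well above chance indicates that labels leak through the hypotheses.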

Taxonomy

Topics: Multimodal Machine Learning Applications · Domain Adaptation and Few-Shot Learning · Topic Modeling