UnNatural Language Inference

Koustuv Sinha; Prasanna Parthasarathi; Joelle Pineau; Adina Williams

arXiv:2101.00010·cs.CL·June 14, 2021

TL;DR

This paper shows that state-of-the-art NLI models are largely insensitive to word order: they often assign the same label to a randomly permuted example as to the original, which complicates the claim that such models have learned humanlike syntax.

Contribution

The study provides empirical evidence that NLI models are largely invariant to random word-order permutations, and proposes a suite of metrics to measure how severe this limitation is.

Findings

01 Models often assign the same label to a permuted example as to the original.

02 In MNLI, 98.7% of examples have at least one permutation that elicits the gold label.

03 The invariance issue is consistent across different model architectures and languages.

Abstract

Recent investigations into the inner workings of state-of-the-art large-scale pre-trained Transformer-based Natural Language Understanding (NLU) models indicate that they appear to know humanlike syntax, at least to some extent. We provide novel evidence that complicates this claim: we find that state-of-the-art Natural Language Inference (NLI) models assign the same labels to permuted examples as they do to the original, i.e., they are largely invariant to random word-order permutations. This behavior notably differs from that of humans; we struggle with ungrammatical sentences. To measure the severity of this issue, we propose a suite of metrics and investigate which properties of particular permutations lead models to be word-order invariant. In the MNLI dataset, for example, we find almost all (98.7%) examples contain at least one permutation which elicits the gold label. Models are…
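The kind of measurement the abstract describes (the share of examples for which at least one random permutation elicits the gold label) can be sketched roughly as follows. This is an illustrative sketch, not the authors' implementation: the function names, the whitespace tokenization, the choice to shuffle both premise and hypothesis, and the `toy_predict` stand-in model are all assumptions made here for the example.

```python
import random
from typing import Callable, Sequence, Tuple

# (premise, hypothesis, gold_label); this tuple layout is an assumption.
Example = Tuple[str, str, str]


def permute_words(sentence: str, rng: random.Random) -> str:
    """Return the sentence with its whitespace-separated words shuffled."""
    words = sentence.split()
    rng.shuffle(words)
    return " ".join(words)


def fraction_with_gold_permutation(
    examples: Sequence[Example],
    predict: Callable[[str, str], str],
    n_permutations: int = 10,
    seed: int = 0,
) -> float:
    """Share of examples where at least one sampled permutation gets the gold label.

    `predict` is a hypothetical stand-in for any NLI model that maps a
    (premise, hypothesis) pair to a label string.
    """
    if not examples:
        return 0.0
    rng = random.Random(seed)
    hits = 0
    for premise, hypothesis, gold in examples:
        for _ in range(n_permutations):
            shuffled_premise = permute_words(premise, rng)
            shuffled_hypothesis = permute_words(hypothesis, rng)
            if predict(shuffled_premise, shuffled_hypothesis) == gold:
                hits += 1
                break  # one gold-eliciting permutation is enough
    return hits / len(examples)


def toy_predict(premise: str, hypothesis: str) -> str:
    """Toy bag-of-words 'model': it ignores word order by construction."""
    return "contradiction" if "not" in hypothesis.split() else "entailment"


examples = [
    ("a man is sleeping", "a person is not awake", "contradiction"),
    ("a dog runs", "an animal moves", "entailment"),
]
score = fraction_with_gold_permutation(examples, toy_predict)
```

Because `toy_predict` only checks which words are present, shuffling cannot change its output, so it is fully order-invariant in this sketch. Replacing it with a real model's prediction function would give an estimate of how often that model still recovers the gold label on scrambled input.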

Code & Models

Repositories

facebookresearch/UNLU (PyTorch, official)

Taxonomy

Methods: Linear Layer · Absolute Position Encodings · Position-Wise Feed-Forward Layer · Weight Decay · Linear Warmup With Linear Decay · Label Smoothing · Multi-Head Attention · Attention Is All You Need · WordPiece · Attention Dropout