Uncovering More Shallow Heuristics: Probing the Natural Language Inference Capacities of Transformer-Based Pre-Trained Language Models Using Syllogistic Patterns

Reto Gubelmann; Siegfried Handschuh

arXiv:2201.07614 · cs.CL · January 20, 2022 · 1 citation

Open Access

TL;DR

This paper investigates how transformer-based pre-trained language models fine-tuned for natural language inference rely on shallow heuristics, and argues that their failure to generalize shows they have not learned the task itself.

Contribution

The study introduces a dataset built on syllogistic patterns to evaluate PLMs, and uses it to show that the models depend on shallow heuristics rather than genuine reasoning.

Findings

01 Models rely on symmetries and asymmetries in premise and hypothesis

02 Lack of generalization indicates reliance on spurious heuristics

03 PLMs do not truly learn NLI but exploit superficial patterns

Abstract

In this article, we explore the shallow heuristics used by transformer-based pre-trained language models (PLMs) that are fine-tuned for natural language inference (NLI). To do so, we construct our own dataset based on syllogistic patterns and evaluate a number of models on it. We find evidence that the models rely heavily on certain shallow heuristics, picking up on symmetries and asymmetries between premise and hypothesis. We suggest that the lack of generalization observable in our study, which is becoming a topic of lively debate in the field, means that the PLMs are currently not learning NLI, but rather spurious heuristics.
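To make the idea of a syllogism-based probe concrete, here is a minimal Python sketch. It is not the authors' dataset or evaluation code: the term triples, the two templates, the two-way label set, and the word-overlap baseline are all illustrative assumptions. The baseline in particular is a stand-in for a shallow heuristic in general, not one of the specific heuristics the paper reports.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NLIExample:
    premise: str
    hypothesis: str
    label: str  # "entailment" or "non-entailment" (assumed two-way labels)


# Hypothetical term triples (A, B, C); not taken from the paper.
TERMS = [
    ("poodles", "dogs", "mammals"),
    ("roses", "flowers", "plants"),
]


def valid_syllogism(a, b, c):
    """All A are B, all B are C, therefore all A are C (a valid pattern)."""
    premise = f"All {a} are {b}, and all {b} are {c}."
    return NLIExample(premise, f"All {a} are {c}.", "entailment")


def reversed_conclusion(a, b, c):
    """Same premises, but the conclusion 'All C are A' does not follow."""
    premise = f"All {a} are {b}, and all {b} are {c}."
    return NLIExample(premise, f"All {c} are {a}.", "non-entailment")


def build_dataset(terms=TERMS):
    """Pair each valid syllogism with an invalid variant over the same words."""
    examples = []
    for a, b, c in terms:
        examples.append(valid_syllogism(a, b, c))
        examples.append(reversed_conclusion(a, b, c))
    return examples


def _words(text):
    """Lowercase the text and strip the punctuation used in the templates."""
    return set(text.lower().replace(",", "").replace(".", "").split())


def lexical_overlap_baseline(example):
    """Shallow stand-in heuristic: predict entailment whenever every
    hypothesis word also occurs in the premise, ignoring word order."""
    if _words(example.hypothesis) <= _words(example.premise):
        return "entailment"
    return "non-entailment"


def accuracy(examples, predict):
    """Fraction of examples whose predicted label matches the gold label."""
    correct = sum(predict(ex) == ex.label for ex in examples)
    return correct / len(examples)
```

Calling `accuracy(build_dataset(), lexical_overlap_baseline)` returns 0.5 here: the baseline accepts every valid syllogism but also every reversed conclusion, because both hypotheses reuse only words from the premise. A fine-tuned PLM could be swapped in for `predict` to check whether it shows a similar asymmetry.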

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Multimodal Machine Learning Applications