Stress-Testing Neural Models of Natural Language Inference with Multiply-Quantified Sentences
Atticus Geiger, Ignacio Cases, Lauri Karttunen, and Christopher Potts

TL;DR
This paper introduces a method for generating multiply-quantified natural language inference (NLI) datasets whose semantic complexity can be precisely characterized, and uses them to show that a variety of common NLI architectures fail to encode crucial information; only a model with forced lexical alignments avoids this loss.
Contribution
The paper presents a novel data generation approach for complex NLI examples and demonstrates that standard models often lose essential semantic information, highlighting the need for specialized architectures.
Findings
A variety of common NLI architectures fail to encode crucial semantic information.
Only the model with forced lexical alignments avoids this information loss.
Standard architectures are insufficient for complex NLI tasks.
Abstract
Standard evaluations of deep learning models for semantics using naturalistic corpora are limited in what they can tell us about the fidelity of the learned representations, because the corpora rarely come with good measures of semantic complexity. To overcome this limitation, we present a method for generating data sets of multiply-quantified natural language inference (NLI) examples in which semantic complexity can be precisely characterized, and we use this method to show that a variety of common architectures for NLI inevitably fail to encode crucial information; only a model with forced lexical alignments avoids this damaging information loss.
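To make "multiply-quantified" concrete, the following is a minimal illustrative sketch, not the authors' generation method. It assumes sentences of the hypothetical form "Q1 SUBJ verb Q2 OBJ" (for example, "every dog chases some cat"), interprets the quantifiers with standard set-theoretic semantics and no existential import, and labels a premise/hypothesis pair by exhaustively checking all models over a tiny domain. The names `QUANTIFIERS`, `holds`, `subsets`, and `label` are invented for this example, and a verdict is only verified up to the chosen domain size.

```python
from itertools import combinations, product

# Quantifiers as relations between a restrictor set `a` and a scope set `b`
# (standard generalized-quantifier semantics; an assumption of this sketch).
QUANTIFIERS = {
    "every": lambda a, b: a <= b,
    "some": lambda a, b: bool(a & b),
    "no": lambda a, b: not (a & b),
}


def holds(q1, q2, subjects, objects, relation):
    """Truth of 'q1 SUBJ verb q2 OBJ', with the subject quantifier taking wide scope."""
    scope = set()
    for x in subjects:
        # Everything that x stands in the verb relation to.
        related = {y for (s, y) in relation if s == x}
        if QUANTIFIERS[q2](objects, related):
            scope.add(x)
    return QUANTIFIERS[q1](subjects, scope)


def subsets(items):
    """All subsets of `items`, as a list of sets."""
    items = list(items)
    return [set(c) for r in range(len(items) + 1) for c in combinations(items, r)]


def label(premise, hypothesis, domain_size=2):
    """Label a pair of (q1, q2) sentences by brute force over small models."""
    domain = range(domain_size)
    pairs = list(product(domain, repeat=2))
    outcomes = set()
    for subjects, objects, relation in product(
        subsets(domain), subsets(domain), subsets(pairs)
    ):
        # Only models that satisfy the premise are relevant.
        if holds(*premise, subjects, objects, relation):
            outcomes.add(holds(*hypothesis, subjects, objects, relation))
    if outcomes == {True}:
        return "entailment"
    if outcomes == {False}:
        return "contradiction"
    return "neutral"


if __name__ == "__main__":
    # "no dog chases some cat" vs. "every dog chases no cat"
    print(label(("no", "some"), ("every", "no")))
    # "no dog chases some cat" vs. "some dog chases some cat"
    print(label(("no", "some"), ("some", "some")))
```

Because the check enumerates every model of a fixed small size, it stays tractable only for toy vocabularies; it is meant to show how the label of a quantified pair can be fully determined by its logical form, which is what lets semantic complexity be characterized precisely.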
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Neural Networks and Applications
