Can NLP Models Correctly Reason Over Contexts that Break the Common Assumptions?
Neeraj Varshney, Mihir Parmar, Nisarg Patel, Divij Handa, Sayantan Sarkar, Man Luo, Chitta Baral

TL;DR
This paper evaluates whether state-of-the-art NLP models can accurately reason over scenarios that violate common assumptions, revealing significant performance gaps and highlighting the need for more robust reasoning capabilities.
Contribution
The study systematically creates evaluation data for reasoning over assumption-breaking contexts and analyzes model performance, exposing current limitations and guiding future improvements.
Findings
Models perform well on assumption-following contexts.
Models struggle with assumption-breaking contexts, with a performance gap of up to 20%.
Analysis reveals key challenges in reasoning over atypical scenarios.
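As a rough illustration of how a gap like the one in the findings could be computed, the sketch below compares accuracy on assumption-following and assumption-breaking contexts. It is a minimal sketch, not the authors' evaluation code: the function names (accuracy, performance_gap), the yes/no labels, and the toy predictions are all hypothetical, and it assumes the gap is an absolute difference in percentage points, which the summary does not specify.

```python
from typing import Sequence


def accuracy(predictions: Sequence[str], gold: Sequence[str]) -> float:
    """Fraction of predictions that exactly match the gold answers."""
    if len(predictions) != len(gold) or len(gold) == 0:
        raise ValueError("predictions and gold must be non-empty and equal in length")
    correct = sum(p == g for p, g in zip(predictions, gold))
    return correct / len(gold)


def performance_gap(acc_following: float, acc_breaking: float) -> float:
    """Absolute gap in percentage points (an assumption; the gap could also be relative)."""
    return (acc_following - acc_breaking) * 100.0


# Toy, made-up data: NOT results from the paper.
gold_following = ["yes", "no", "yes", "no"]
pred_following = ["yes", "no", "yes", "no"]   # 4 of 4 correct

gold_breaking = ["no", "yes", "no", "yes"]
pred_breaking = ["no", "yes", "yes", "yes"]   # 3 of 4 correct

acc_following = accuracy(pred_following, gold_following)
acc_breaking = accuracy(pred_breaking, gold_breaking)
gap = performance_gap(acc_following, acc_breaking)

print(f"following: {acc_following:.2f}, breaking: {acc_breaking:.2f}, gap: {gap:.1f} points")
```

Reporting the gap in percentage points keeps it comparable across models with different baseline accuracy; a relative gap (divided by the assumption-following accuracy) would be an equally plausible reading.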
Abstract
Pre-training on large corpora of text enables the language models to acquire a vast amount of factual and commonsense knowledge which allows them to achieve remarkable performance on a variety of language understanding tasks. They typically acquire this knowledge by learning from the pre-training text and capturing certain patterns from it. However, real-world settings often present scenarios that do not abide by these patterns i.e. scenarios that break the common assumptions. Can state-of-the-art NLP models correctly reason over the contexts of such scenarios? Addressing the above question, in this paper, we investigate the ability of models to correctly reason over contexts that break the common assumptions. To this end, we first systematically create evaluation data in which each data instance consists of (a) a common assumption, (b) a context that follows the assumption, (c) a…
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Text Readability and Simplification
Methods: Gated Linear Unit · Attention Is All You Need · Adafactor · Cosine Annealing · Softmax · Layer Normalization · Inverse Square Root Schedule
