Simple Linguistic Inferences of Large Language Models (LLMs): Blind Spots and Blinds
Victoria Basmov, Yoav Goldberg, Reut Tsarfaty

TL;DR
This paper evaluates large language models on simple linguistic inference tasks that most humans find trivial. It reveals significant blind spots: models struggle with basic entailments and are further confused when the premise is embedded in certain syntactic constructions.
Contribution
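It introduces targeted evaluation sets for simple inference tasks (grammatically-specified entailments, evidential adverbs of uncertainty, and monotonicity entailments) and demonstrates that LLMs have notable blind spots, which worsen when the premise is embedded under presupposition triggers or non-factives.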
Findings
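Models show moderate to low performance on the simple inference evaluation sets, in both zero-shot and chain-of-thought setups.
Embedding the premise under presupposition triggers (which should preserve the entailment relation) or non-factives (which should change it) further confuses models, causing them to under-predict or over-predict certain entailment labels regardless of the true relation.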
Even strong LLMs have blind spots regarding specific entailment types.
Abstract
We evaluate LLMs' language understanding capacities on simple inference tasks that most humans find trivial. Specifically, we target (i) grammatically-specified entailments, (ii) premises with evidential adverbs of uncertainty, and (iii) monotonicity entailments. We design evaluation sets for these tasks and conduct experiments in both zero-shot and chain-of-thought setups, and with multiple prompts and LLMs. The models exhibit moderate to low performance on these evaluation sets. Subsequent experiments show that embedding the premise in syntactic constructions that should preserve the entailment relations (presupposition triggers) or change them (non-factives), further confuses the models, causing them to either under-predict or over-predict certain entailment labels regardless of the true relation, and often disregarding the nature of the embedding context. Overall these results…
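The evaluation setup described in the abstract can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the three-way label set, the prompt wording, the `Item` and `score` names, and the example sentences are hypothetical and are not taken from the paper's evaluation sets. The sketch builds a zero-shot prompt for a premise/hypothesis pair and reports accuracy together with a per-label count difference, which is one simple way to see whether a model over-predicts or under-predicts a label.

```python
from collections import Counter
from dataclasses import dataclass

# Assumed three-way label set; the paper's exact labels may differ.
LABELS = ("entailment", "neutral", "contradiction")


@dataclass(frozen=True)
class Item:
    """One hypothetical evaluation item: a premise, a hypothesis, a gold label."""
    premise: str
    hypothesis: str
    gold: str


def build_prompt(item: Item) -> str:
    """Format a zero-shot prompt (the wording is illustrative, not the paper's)."""
    return (
        f"Premise: {item.premise}\n"
        f"Hypothesis: {item.hypothesis}\n"
        "Answer with one of: " + ", ".join(LABELS) + "."
    )


def score(items: list[Item], predictions: list[str]) -> tuple[float, dict[str, int]]:
    """Return accuracy and, per label, predicted count minus gold count.

    A positive difference means the label is over-predicted,
    a negative one means it is under-predicted.
    """
    if len(items) != len(predictions):
        raise ValueError("items and predictions must have the same length")
    correct = sum(pred == item.gold for item, pred in zip(items, predictions))
    gold_counts = Counter(item.gold for item in items)
    pred_counts = Counter(predictions)
    bias = {label: pred_counts[label] - gold_counts[label] for label in LABELS}
    return correct / len(items), bias


# Hypothetical items in the spirit of the three task families.
ITEMS = [
    Item("Jack has a dog.", "Jack has an animal.", "entailment"),
    Item("Jack has an animal.", "Jack has a dog.", "neutral"),
    Item("Supposedly, Jack has a dog.", "Jack has a dog.", "neutral"),
]

if __name__ == "__main__":
    # Stand-in for model outputs: a model that always answers "entailment".
    always_entail = ["entailment"] * len(ITEMS)
    accuracy, bias = score(ITEMS, always_entail)
    print(build_prompt(ITEMS[0]))
    print(accuracy, bias)
```

In this sketch, a model that always answers "entailment" gets one of the three items right and shows a surplus of two "entailment" predictions, matched by a deficit of two "neutral" predictions. That is the kind of label skew the abstract describes as over-prediction and under-prediction.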
Taxonomy
Topics: Topic Modeling · Text Readability and Simplification · Explainable Artificial Intelligence (XAI)
