Language models are not naysayers: An analysis of language models on negation benchmarks
Thinh Hung Truong, Timothy Baldwin, Karin Verspoor, Trevor Cohn

TL;DR
This paper evaluates whether current large language models can effectively understand and handle negation, revealing significant limitations in their ability to process this fundamental linguistic feature.
Contribution
It provides a comprehensive analysis of LLM performance on negation benchmarks, highlighting the models' insensitivity to negation and their failures to reason under it.
Findings
LLMs show insensitivity to negation presence
They struggle with lexical semantics of negation
They fail to reason correctly under negation
Abstract
Negation has been shown to be a major bottleneck for masked language models, such as BERT. However, whether this finding still holds for larger-sized auto-regressive language models ("LLMs") has not been studied comprehensively. With the ever-increasing volume of research and applications of LLMs, we take a step back to evaluate the ability of current-generation LLMs to handle negation, a fundamental linguistic phenomenon that is central to language understanding. We evaluate different LLMs, including the open-source GPT-neo, GPT-3, and InstructGPT, against a wide range of negation benchmarks. Through systematic experimentation with varying model sizes and prompts, we show that LLMs have several limitations including insensitivity to the presence of negation, an inability to capture the lexical semantics of negation, and a failure to reason under negation.
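As a rough illustration of what "insensitivity to the presence of negation" could mean operationally, the sketch below computes the fraction of affirmative/negated prompt pairs for which a model returns the same completion. This is not the paper's evaluation code: the function `negation_insensitivity`, the toy predictors, and the prompt pairs are all hypothetical stand-ins, and a real study would call an actual language model in place of `predict`.

```python
from typing import Callable, Iterable, Tuple


def negation_insensitivity(
    pairs: Iterable[Tuple[str, str]],
    predict: Callable[[str], str],
) -> float:
    """Fraction of (affirmative, negated) prompt pairs whose prediction is unchanged.

    A value near 1.0 means the predictor largely ignores the negation cue.
    """
    pairs = list(pairs)
    if not pairs:
        raise ValueError("need at least one prompt pair")
    unchanged = sum(1 for aff, neg in pairs if predict(aff) == predict(neg))
    return unchanged / len(pairs)


# Hypothetical stand-in for a model: it looks only at the subject and
# ignores the word "cannot", so it is fully insensitive to negation.
def toy_insensitive_predict(prompt: str) -> str:
    return "fly" if prompt.startswith("Birds") else "swim"


# Hypothetical stand-in that does change its answer under negation.
def toy_sensitive_predict(prompt: str) -> str:
    base = "fly" if prompt.startswith("Birds") else "swim"
    return "talk" if "cannot" in prompt else base


# Illustrative prompt pairs (assumed for this sketch, not from a benchmark).
prompt_pairs = [
    ("Birds can", "Birds cannot"),
    ("Fish can", "Fish cannot"),
]

insensitive_rate = negation_insensitivity(prompt_pairs, toy_insensitive_predict)
sensitive_rate = negation_insensitivity(prompt_pairs, toy_sensitive_predict)
```

With these toy predictors, the first rate is 1.0 (every pair unchanged) and the second is 0.0. Note that a changed prediction is only a necessary condition for handling negation: it says nothing about whether the new completion is correct.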
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Explainable Artificial Intelligence (XAI)
Methods: Attention Is All You Need · Linear Layer · Cosine Annealing · Layer Normalization · Multi-Head Attention · Weight Decay · Residual Connection
