Language models are not naysayers: An analysis of language models on negation benchmarks

Thinh Hung Truong; Timothy Baldwin; Karin Verspoor; Trevor Cohn

arXiv:2306.08189·cs.CL·June 16, 2023·2 cites

TL;DR

This paper evaluates whether current large language models can effectively understand and handle negation, revealing significant limitations in their ability to process this fundamental linguistic feature.

Contribution

It provides a comprehensive analysis of LLM performance on negation benchmarks across model sizes and prompts, highlighting insensitivity to negation and failures of reasoning under negation.

Findings

01 LLMs are insensitive to the presence of negation

02 They fail to capture the lexical semantics of negation

03 They fail to reason correctly under negation

Abstract

Negation has been shown to be a major bottleneck for masked language models, such as BERT. However, whether this finding still holds for larger-sized auto-regressive language models ("LLMs") has not been studied comprehensively. With the ever-increasing volume of research and applications of LLMs, we take a step back to evaluate the ability of current-generation LLMs to handle negation, a fundamental linguistic phenomenon that is central to language understanding. We evaluate different LLMs (including the open-source GPT-neo, GPT-3, and InstructGPT) against a wide range of negation benchmarks. Through systematic experimentation with varying model sizes and prompts, we show that LLMs have several limitations including insensitivity to the presence of negation, an inability to capture the lexical semantics of negation, and a failure to reason under negation.
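
To make "insensitivity to the presence of negation" concrete, one simple way to quantify it is to count how often a model's prediction stays the same when a prompt is negated. The sketch below is an illustration only, not the paper's metric or code: `negation_insensitivity`, `toy_predict`, and the prompt pairs are all hypothetical, and `toy_predict` is a stand-in that deliberately ignores "not" in place of a real language model.

```python
from typing import Callable, Iterable, Tuple


def negation_insensitivity(
    predict: Callable[[str], str],
    pairs: Iterable[Tuple[str, str]],
) -> float:
    """Fraction of (affirmative, negated) prompt pairs with identical predictions.

    1.0 means the predictor never changes its answer under negation;
    0.0 means it always changes it.
    """
    pairs = list(pairs)
    if not pairs:
        raise ValueError("need at least one prompt pair")
    unchanged = sum(1 for aff, neg in pairs if predict(aff) == predict(neg))
    return unchanged / len(pairs)


def toy_predict(prompt: str) -> str:
    # Hypothetical stand-in for a language model: it drops "not" before
    # answering, so it is blind to negation by construction.
    cleaned = prompt.replace(" not ", " ")
    return "bird" if "robin" in cleaned else "unknown"


# Hypothetical prompt pairs (assumption: not taken from any benchmark).
PAIRS = [
    ("A robin is a", "A robin is not a"),
    ("A trout is a", "A trout is not a"),
]

score = negation_insensitivity(toy_predict, PAIRS)
print(f"insensitivity: {score:.2f}")
```

With a real model, `predict` would wrap a call that returns the top completion for a prompt. A high score would then suggest the model is ignoring the negation cue, though it does not by itself show that the unchanged answers are wrong.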

Code & Models

Repositories

joey234/llm-neg-bench (Official)

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Explainable Artificial Intelligence (XAI)

Methods: Attention Is All You Need · Linear Layer · Cosine Annealing · Layer Normalization · Multi-Head Attention · Weight Decay · Residual Connection