Negation-Induced Forgetting in LLMs
Francesca Capuano, Ellen Boschert, Barbara Kaup

TL;DR
This paper investigates whether Large Language Models exhibit negation-induced forgetting (NIF), a cognitive bias in which negating incorrect information about an item reduces later recall of that item. It finds that some models, such as ChatGPT-3.5, show the effect.
Contribution
It adapts human cognitive experiments to test for negation-induced forgetting in LLMs, providing initial evidence of this bias in some models.
Findings
ChatGPT-3.5 exhibits negation-induced forgetting.
GPT-4o-mini shows a marginally significant NIF effect.
LLaMA-3-70B does not exhibit NIF.
Abstract
The study explores whether Large Language Models (LLMs) exhibit negation-induced forgetting (NIF), a cognitive phenomenon observed in humans in which negating incorrect attributes of an object or event leads to diminished recall of that object or event compared to affirming correct attributes (Mayo et al., 2014; Zang et al., 2023). We adapted the experimental framework of Zang et al. (2023) to test this effect in ChatGPT-3.5, GPT-4o-mini, and Llama3-70b-instruct. Our results show that ChatGPT-3.5 exhibits NIF, with negated information being less likely to be recalled than affirmed information. GPT-4o-mini showed a marginally significant NIF effect, while LLaMA-3-70B did not exhibit NIF. The findings provide initial evidence of negation-induced forgetting in some LLMs, suggesting that similar cognitive biases may emerge in these models. This work is a preliminary step in understanding how…
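To make the comparison in the abstract concrete, here is a minimal Python sketch of how a recall difference between affirmed and negated items could be computed. It is an illustration under assumptions, not the authors' analysis: the `Trial` record, the `recall_rate` and `nif_effect` helpers, and the toy trials are all hypothetical, and the paper's actual materials and statistical tests are not described here.

```python
from dataclasses import dataclass


@dataclass
class Trial:
    """One study item and whether it was later recalled (hypothetical record)."""
    item: str
    condition: str  # "affirmed" (correct attribute confirmed) or "negated" (incorrect attribute denied)
    recalled: bool


def recall_rate(trials, condition):
    """Proportion of items in the given condition that were recalled."""
    subset = [t for t in trials if t.condition == condition]
    if not subset:
        raise ValueError(f"no trials in condition {condition!r}")
    return sum(t.recalled for t in subset) / len(subset)


def nif_effect(trials):
    """Affirmed-minus-negated recall rate; a positive value points in the NIF direction."""
    return recall_rate(trials, "affirmed") - recall_rate(trials, "negated")


# Made-up toy trials for illustration only; these are not data from the paper.
toy_trials = [
    Trial("item_a", "affirmed", True),
    Trial("item_b", "affirmed", True),
    Trial("item_c", "negated", True),
    Trial("item_d", "negated", False),
]

print(nif_effect(toy_trials))  # 1.0 - 0.5 = 0.5
```

A raw difference like this only describes the direction of an effect. Deciding whether it is significant, marginal, or absent, as reported for the three models, would need an inferential test over many trials.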
Taxonomy
Topics: Memory Processes and Influences · Artificial Intelligence in Healthcare and Education · Neurobiology of Language and Bilingualism
