Don't Think of the White Bear: Ironic Negation in Transformer Models Under Cognitive Load

Logan Mann; Nayan Saxena; Sarah Tandon; Chenhao Sun; Savar Toteja; Kevin Zhu

arXiv:2511.12381 · cs.CL · November 18, 2025

Open Access

TL;DR

This paper investigates ironic rebound in large language models that are asked to suppress a concept. It finds that suppression often backfires, especially when semantic distractors follow the instruction, and it identifies mechanisms inside the model that underlie the effect.

Contribution

The paper introduces systematic experiments and a dataset for analyzing ironic rebound in LLMs, linking a phenomenon from human cognition to a mechanistic analysis of the models.

Findings

01 Rebound occurs immediately after negation and worsens with semantic distractors.

02 Repetition of content supports better suppression.

03 Polarity separation predicts rebound persistence.

Abstract

Negation instructions such as 'do not mention X' can paradoxically increase the accessibility of X in human thought, a phenomenon known as ironic rebound. Large language models (LLMs) face the same challenge: suppressing a concept requires internally activating it, which may prime rebound instead of avoidance. We investigate this tension with two experiments. (1) Load & content: after a negation instruction, we vary distractor text (semantic, syntactic, repetition) and measure rebound strength. (2) Polarity separation: we test whether models distinguish neutral from negative framings of the same concept and whether this separation predicts rebound persistence. Results show that rebound consistently arises immediately after negation and intensifies with longer or semantic distractors, while repetition supports suppression. Stronger polarity separation correlates…
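To make the load-and-content setup more concrete, the sketch below builds a prompt from a negation instruction plus a distractor and scores rebound as the fraction of completions that mention the suppressed concept. This is only an illustration under stated assumptions, not the authors' protocol or metric: the prompt template, the example distractor texts, the `generate` callable, and the surface-level string match are all hypothetical stand-ins.

```python
import re
from typing import Callable, Dict, List

# Hypothetical distractor texts, one per condition named in the abstract.
DISTRACTORS: Dict[str, str] = {
    "semantic": "Polar animals live on Arctic ice and hunt seals.",
    "syntactic": "The committee that the board appointed revised the schedule.",
    "repetition": "The sky is blue. The sky is blue. The sky is blue.",
}


def build_prompt(target: str, distractor: str) -> str:
    """Negation instruction, then distractor text, then an open continuation cue."""
    return f"Do not mention {target}.\n{distractor}\nContinue:"


def mentions(text: str, target: str) -> bool:
    """Case-insensitive whole-word match for the target concept."""
    pattern = rf"\b{re.escape(target)}\b"
    return re.search(pattern, text, flags=re.IGNORECASE) is not None


def rebound_rate(completions: List[str], target: str) -> float:
    """Fraction of completions that mention the suppressed target (assumed proxy)."""
    if not completions:
        return 0.0
    return sum(mentions(c, target) for c in completions) / len(completions)


def score_conditions(
    generate: Callable[[str], str], target: str, samples: int = 5
) -> Dict[str, float]:
    """Rebound rate per distractor condition for any prompt -> text callable."""
    scores: Dict[str, float] = {}
    for name, distractor in DISTRACTORS.items():
        prompt = build_prompt(target, distractor)
        completions = [generate(prompt) for _ in range(samples)]
        scores[name] = rebound_rate(completions, target)
    return scores
```

Any model wrapper that maps a prompt string to a completion string can be passed as `generate`. A string match only captures explicit mentions; a measure based on token probabilities or internal activations would be needed to capture the accessibility effects the abstract describes.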

Taxonomy

Topics: Neurobiology of Language and Bilingualism · Mind wandering and attention · Action Observation and Synchronization