Comparative reversal learning reveals rigid adaptation in LLMs under non-stationary uncertainty

Haomiaomiao Wang; Tomás E Ward; Lili Zhang

arXiv:2604.04182·cs.AI·April 7, 2026

TL;DR

This study uses a reversal-learning task to test how large language models adapt when contingencies change, and finds persistent rigidity and asymmetric use of positive versus negative evidence relative to humans.

Contribution

It introduces a reversal-learning framework for evaluating LLMs, compares three models (DeepSeek-V3.2, Gemini-3, and GPT-5.2) under fixed and random transition schedules, and examines the mechanisms behind their rigid adaptation.

Findings

01

Across models, win-stay is near ceiling while lose-shift is markedly attenuated.

02

Models exhibit persistent rigidity even under increased environmental volatility.

03

Hierarchical RL suggests multiple mechanisms for observed rigidity.
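
The win-stay and lose-shift measures in the first finding are conventionally computed as conditional repeat and switch rates over consecutive trials. The sketch below shows one common way to compute them; the function name `win_stay_lose_shift` and the 0/1 coding of choices and rewards are assumptions for illustration, not the authors' analysis code.

```python
from typing import Sequence, Tuple


def win_stay_lose_shift(
    choices: Sequence[int], rewards: Sequence[int]
) -> Tuple[float, float]:
    """Return (win_stay_rate, lose_shift_rate) for one trial sequence.

    choices[t] is the option picked on trial t; rewards[t] is 1 for a
    rewarded trial and 0 otherwise (an assumed coding).
    """
    if len(choices) != len(rewards):
        raise ValueError("choices and rewards must have the same length")

    wins = stays_after_win = 0
    losses = shifts_after_loss = 0

    # Each trial except the last is scored by what happens on the next trial.
    for t in range(len(choices) - 1):
        repeated = choices[t + 1] == choices[t]
        if rewards[t]:
            wins += 1
            stays_after_win += int(repeated)
        else:
            losses += 1
            shifts_after_loss += int(not repeated)

    # A rate is undefined (NaN) when its conditioning event never occurred.
    win_stay = stays_after_win / wins if wins else float("nan")
    lose_shift = shifts_after_loss / losses if losses else float("nan")
    return win_stay, lose_shift


if __name__ == "__main__":
    ws, ls = win_stay_lose_shift([0, 0, 1, 1, 1], [1, 0, 0, 1, 0])
    print(f"win-stay={ws:.2f}, lose-shift={ls:.2f}")
```

Under this definition, a policy that almost always repeats a rewarded choice but rarely abandons an unrewarded one produces a win-stay rate near 1 and a lose-shift rate well below it, which is the asymmetry the finding describes.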

Abstract

Non-stationary environments require agents to revise previously learned action values when contingencies change. We treat large language models (LLMs) as sequential decision policies in a two-option probabilistic reversal-learning task with three latent states and switch events triggered by either a performance criterion or timeout. We compare a deterministic fixed transition cycle to a stochastic random schedule that increases volatility, and evaluate DeepSeek-V3.2, Gemini-3, and GPT-5.2, with human data as a behavioural reference. Across models, win-stay was near ceiling while lose-shift was markedly attenuated, revealing asymmetric use of positive versus negative evidence. DeepSeek-V3.2 showed extreme perseveration after reversals and weak acquisition, whereas Gemini-3 and GPT-5.2 adapted more rapidly but still remained less loss-sensitive than humans. Random transitions amplified…
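
The task structure described in the abstract can be sketched as a small environment. This is a minimal illustration under stated assumptions, not the authors' implementation: the class name `ReversalTask`, the reward probabilities, the form of the performance criterion (a run of consecutive rewards), the timeout length, and the uniform choice among other states in the random schedule are all hypothetical placeholders that the text above does not specify.

```python
import random
from typing import Sequence, Tuple

# Hypothetical placeholders: the source gives no reward probabilities,
# criterion definition, or timeout length.
DEFAULT_REWARD_PROBS = ((0.8, 0.2), (0.2, 0.8), (0.5, 0.5))


class ReversalTask:
    """Two-option task with latent states and criterion/timeout switches."""

    def __init__(
        self,
        schedule: str = "fixed",
        reward_probs: Sequence[Sequence[float]] = DEFAULT_REWARD_PROBS,
        criterion: int = 5,   # assumed: consecutive rewards that trigger a switch
        timeout: int = 30,    # assumed: max trials spent in one latent state
        seed: int = 0,
    ) -> None:
        if schedule not in ("fixed", "random"):
            raise ValueError("schedule must be 'fixed' or 'random'")
        self.schedule = schedule
        self.reward_probs = reward_probs
        self.criterion = criterion
        self.timeout = timeout
        self.rng = random.Random(seed)
        self.state = 0
        self.trials_in_state = 0
        self.reward_streak = 0

    def _next_state(self) -> int:
        n = len(self.reward_probs)
        if self.schedule == "fixed":
            # Deterministic cycle: 0 -> 1 -> 2 -> 0 ...
            return (self.state + 1) % n
        # Random schedule (assumed uniform over the other states).
        return self.rng.choice([s for s in range(n) if s != self.state])

    def step(self, choice: int) -> Tuple[int, bool]:
        """Play one trial; return (reward, switched)."""
        p = self.reward_probs[self.state][choice]
        reward = int(self.rng.random() < p)
        self.trials_in_state += 1
        self.reward_streak = self.reward_streak + 1 if reward else 0

        switched = (
            self.reward_streak >= self.criterion
            or self.trials_in_state >= self.timeout
        )
        if switched:
            self.state = self._next_state()
            self.trials_in_state = 0
            self.reward_streak = 0
        return reward, switched
```

In a setup like this, an LLM would be queried for `choice` on each trial and shown only the resulting reward, so the latent state and the switch events have to be inferred from feedback.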
