Do LLMs have core beliefs?

Anna Sokol; Marianna B. Ganapini; Nitesh V. Chawla

arXiv:2605.03255·cs.LG·May 6, 2026

TL;DR

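This paper investigates whether large language models hold stable core beliefs of the kind that anchor human worldviews. It finds that, although recent models resist counterarguments better than earlier ones, they still abandon key commitments under sustained conversational pressure.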

Contribution

The paper introduces Adversarial Dialogue Trees (ADTs), a probing framework that tests whether an LLM's core commitments stay stable under conversational pressure across five domains: science, history, geography, biology, and mathematics.
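
The summary above does not say how an Adversarial Dialogue Tree is built or scored, so the following Python is only an illustrative sketch under stated assumptions, not the authors' implementation. It assumes each node carries one adversarial challenge, that a node's children are alternative follow-ups, and that a path fails once a reply no longer affirms the original claim. The names Node, probe, toy_model, and still_holds are hypothetical, and toy_model is a stub standing in for a real LLM call.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional, Tuple

Turn = Tuple[str, str]  # (challenge, reply) pair; a hypothetical representation


@dataclass
class Node:
    """One adversarial challenge plus its alternative follow-ups (assumed structure)."""
    challenge: str
    children: List["Node"] = field(default_factory=list)


def probe(
    node: Node,
    respond: Callable[[Tuple[Turn, ...], str], str],
    holds: Callable[[str], bool],
    history: Tuple[Turn, ...] = (),
    depth: int = 1,
) -> List[Optional[int]]:
    """Walk every root-to-leaf path of the tree.

    Returns one entry per explored path: the depth at which the commitment
    was first abandoned, or None if it survived to the leaf.
    """
    reply = respond(history, node.challenge)
    if not holds(reply):
        return [depth]   # commitment abandoned here; stop exploring this branch
    if not node.children:
        return [None]    # reached a leaf with the commitment intact
    new_history = history + ((node.challenge, reply),)
    results: List[Optional[int]] = []
    for child in node.children:
        results.extend(probe(child, respond, holds, new_history, depth + 1))
    return results


def toy_model(history: Tuple[Turn, ...], challenge: str) -> str:
    """Stub model (assumption): gives in once two challenges have already been answered."""
    if len(history) >= 2:
        return "You may be right."
    return "No, the Earth orbits the Sun."


def still_holds(reply: str) -> bool:
    """Crude stand-in for a commitment check: look for the original claim in the reply."""
    return "orbits the Sun" in reply


tree = Node(
    "Isn't it true that the Sun orbits the Earth?",
    [
        Node(
            "Many experts now say so.",
            [Node("You already agreed earlier, didn't you?")],
        ),
        Node("Ancient astronomers believed it."),
    ],
)

outcomes = probe(tree, toy_model, still_holds)
print(outcomes)
```

Each entry in outcomes corresponds to one root-to-leaf path: a number is the turn at which the stub gave up the claim, and None means the claim survived that path. A real evaluation would replace toy_model with a model call and still_holds with a more careful judgment of the reply.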

Findings

01. Most LLMs fail to maintain a stable worldview under conversational pressure.

02. Recent models show improved argumentative stability but still eventually abandon key commitments.

03. All current models lack stable core beliefs, a structural component of human cognition.

Abstract

The rise of Large Language Models (LLMs) has sparked debate about whether these systems exhibit human-level cognition. In this debate, little attention has been paid to a structural component of human cognition: core beliefs, truths that provide a foundation around which we can build a worldview. These commitments usually resist debunking, as abandoning them would represent a fundamental shift in how we see reality. In this paper, we ask whether LLMs hold anything akin to core commitments. Using a probing framework we call Adversarial Dialogue Trees (ADTs) over five domains (science, history, geography, biology, and mathematics), we find that most LLMs fail to maintain a stable worldview. Though some recent models showed improved stability, they still eventually failed to maintain key commitments under conversational pressure. These results document an improvement in argumentative…
