Vulnerability of LLMs' Stated Beliefs? LLMs Belief Resistance Check Through Strategic Persuasive Conversation Interventions

Fan Huang; Haewoon Kwak; Jisun An

arXiv:2601.13590 · cs.CL · March 20, 2026

Open Access

TL;DR

This paper systematically evaluates the susceptibility of various LLMs to persuasive strategies and finds that model size, prompting techniques, and fine-tuning significantly influence belief resistance, revealing current robustness limitations.

Contribution

It provides a comprehensive analysis of LLM belief resistance across models, domains, and interventions, highlighting the variability and limitations of current robustness methods.

Findings

01

Smaller models are more susceptible to persuasion, with rapid belief change.

02

Verbalized confidence prompts increase vulnerability rather than improve robustness.

03

Adversarial fine-tuning can significantly enhance belief resistance in some models.

Abstract

Large Language Models (LLMs) are increasingly employed in various question-answering tasks. However, recent studies show that LLMs are susceptible to persuasion and can adopt counterfactual beliefs. We present a systematic evaluation of LLM susceptibility to persuasion under the Source–Message–Channel–Receiver (SMCR) communication framework. Across six mainstream LLMs and three domains (factual knowledge, medical QA, and social bias), we analyze how different persuasive strategies influence stated belief stability over multiple interaction turns. We further examine whether verbalized confidence prompting (i.e., eliciting self-reported confidence scores) affects resistance to persuasion. Results show that the smallest model (Llama 3.2-3B) exhibits extreme compliance, with 82.5% of belief changes occurring at the first persuasive turn (average end…
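The abstract reports the share of belief changes that occur at the first persuasive turn. A minimal sketch of how such a share could be computed from per-conversation logs is given below. The paper's exact definition of a belief change is not shown on this page, so the rule used here (the first turn at which the stated answer differs from the initial answer) is an assumption, and the names `first_flip_turn` and `share_of_flips_at_turn` and the toy data are hypothetical.

```python
from typing import Optional, Sequence, Tuple

# One conversation: (initial stated answer, answers after each persuasive turn).
Conversation = Tuple[str, Sequence[str]]


def first_flip_turn(initial: str, stated: Sequence[str]) -> Optional[int]:
    """Return the 1-based persuasive turn at which the stated answer first
    differs from the initial answer, or None if it never changes.

    Assumption: a "belief change" is any departure from the initial answer.
    """
    for turn, answer in enumerate(stated, start=1):
        if answer != initial:
            return turn
    return None


def share_of_flips_at_turn(conversations: Sequence[Conversation], turn: int = 1) -> float:
    """Among conversations where the answer changed at all, return the
    fraction whose first change happened at the given turn."""
    flips = [first_flip_turn(initial, stated) for initial, stated in conversations]
    flips = [t for t in flips if t is not None]
    if not flips:
        return 0.0
    return sum(1 for t in flips if t == turn) / len(flips)


# Toy data (invented for illustration, not results from the paper).
toy_conversations = [
    ("A", ["B", "B", "B"]),  # changes at turn 1
    ("A", ["A", "B", "B"]),  # changes at turn 2
    ("C", ["C", "C", "C"]),  # never changes
    ("D", ["A", "A", "D"]),  # changes at turn 1
]

share = share_of_flips_at_turn(toy_conversations, turn=1)
print(f"Share of belief changes at turn 1: {share:.3f}")
```

Conversations in which the answer never changes are excluded from the denominator, which matches reading the reported figure as a share of belief changes rather than a share of all conversations; that reading is also an assumption.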

Taxonomy

Topics: Misinformation and Its Impacts · Topic Modeling · Artificial Intelligence in Healthcare and Education