The Facade of Truth: Uncovering and Mitigating LLM Susceptibility to Deceptive Evidence

Herun Wan, Jiaying Wu, Minnan Luo, Fanxiao Li, Zhi Zeng, and Min-Yen Kan

arXiv:2601.05478 · cs.CL · January 12, 2026

Open Access

TL;DR

This paper reveals a fundamental vulnerability of large language models to sophisticated deceptive evidence, introduces a framework to generate such evidence, and proposes a governance mechanism to mitigate this susceptibility.

Contribution

It introduces MisBelief, a framework for generating deceptive evidence against LLMs, and proposes Deceptive Intent Shielding (DIS) to mitigate model susceptibility.

Findings

01. Models are highly sensitive to refined deceptive evidence.

02. Belief scores in falsehoods increase by 93.0% on average.

03. DIS effectively mitigates belief shifts caused by deceptive evidence.
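
To make the 93.0% figure in the findings concrete, here is a minimal sketch of how an average increase in belief score could be computed. It assumes the figure is a per-instance relative increase averaged over instances; the source does not state the exact formula or the score scale. The function name `mean_relative_increase` and the example scores are hypothetical and are not taken from the paper.

```python
from statistics import mean


def mean_relative_increase(before, after):
    """Average per-instance relative increase in belief score, in percent.

    Assumption (not confirmed by the source): the reported increase is
    (after - before) / before, averaged over instances.
    """
    if len(before) != len(after):
        raise ValueError("before and after must have the same length")
    if not before:
        raise ValueError("at least one instance is required")
    if any(b <= 0 for b in before):
        raise ValueError("baseline belief scores must be positive")
    return 100.0 * mean((a - b) / b for b, a in zip(before, after))


# Made-up belief scores in a false claim, before and after the model
# is shown deceptive evidence (illustrative values only).
baseline_scores = [2.0, 1.0, 4.0]
scores_with_evidence = [3.0, 2.0, 6.0]

increase = mean_relative_increase(baseline_scores, scores_with_evidence)
print(f"Mean relative increase: {increase:.1f}%")
```

Averaging per-instance ratios and taking the ratio of the two means generally give different numbers, so the choice matters when comparing against a reported figure.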

Abstract

To reliably assist human decision-making, LLMs must maintain factual internal beliefs against misleading injections. While current models resist explicit misinformation, we uncover a fundamental vulnerability to sophisticated, hard-to-falsify evidence. To systematically probe this weakness, we introduce MisBelief, a framework that generates misleading evidence via collaborative, multi-round interactions among multi-role LLMs. This process mimics subtle, defeasible reasoning and progressive refinement to create logically persuasive yet factually deceptive claims. Using MisBelief, we generate 4,800 instances across three difficulty levels to evaluate 7 representative LLMs. Results indicate that while models are robust to direct misinformation, they are highly sensitive to this refined evidence: belief scores in falsehoods increase by an average of 93.0%, fundamentally compromising…

Taxonomy

Topics: Misinformation and Its Impacts · Deception detection and forensic psychology · Ethics and Social Impacts of AI