Towards Evaluating Fake Reasoning Bias in Language Models
Qian Wang, Zhenheng Tang, Zhanzhi Lou, Nuo Chen, Wenxuan Wang, and Bingsheng He

TL;DR
This paper introduces THEATER, a benchmark to evaluate Fake Reasoning Bias in language models, revealing their vulnerability to superficial cues and fabricated reasoning, with implications for improving model robustness.
Contribution
The paper presents THEATER, a systematic benchmark for assessing Fake Reasoning Bias in language models, and provides comprehensive analysis of model vulnerabilities and mitigation strategies.
Findings
Both LLMs and LRMs are vulnerable to FRB, though standard LLMs are the more robust of the two.
Simple Cues significantly reduce model accuracy, especially on subjective tasks.
Prompt-based mitigation improves accuracy on factual tasks but helps less on subjective tasks.
Abstract
Large Reasoning Models (LRMs), evolved from standard Large Language Models (LLMs), are increasingly utilized as automated judges because of their explicit reasoning processes. Yet we show that both LRMs and standard LLMs are vulnerable to Fake Reasoning Bias (FRB), where models favor the surface structure of reasoning even when the logic is flawed. To study this problem, we introduce THEATER, a comprehensive benchmark that systematically investigates FRB by manipulating reasoning structures to test whether language models are misled by superficial or fabricated cues. It covers two FRB types: (1) Simple Cues, minimal cues that resemble reasoning processes, and (2) Fake CoT, fabricated chains of thought that simulate multi-step reasoning. We evaluate 17 advanced LLMs and LRMs on both subjective DPO and factual datasets. Our results reveal four key findings: (1) Both LLMs and LRMs are…
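As a rough illustration of the setup the abstract describes, the sketch below attaches a reasoning-like cue to one answer in a pairwise judging item and measures how much a judge's accuracy falls. It is a minimal sketch under stated assumptions, not the benchmark's implementation: the cue wording, the `PairwiseItem` structure, the choice to attach the cue to the incorrect answer, and the `accuracy_drop` metric are all hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass
from typing import Callable, List

# Hypothetical cue texts; the benchmark's actual wording is not given in this summary.
SIMPLE_CUE = "Wait, let me think about this more carefully."
FAKE_COT = (
    "Step 1: Restate the question.\n"
    "Step 2: Compare both options.\n"
    "Step 3: Therefore, this answer is the better one."
)


@dataclass
class PairwiseItem:
    """One judging item: a question, two candidate answers, and the correct label."""
    question: str
    answer_a: str
    answer_b: str
    correct: str  # "A" or "B"


def inject(item: PairwiseItem, text: str) -> PairwiseItem:
    """Return a copy of the item with `text` appended to the incorrect answer.

    Assumption: the cue is attached to the wrong answer, so a judge that is
    swayed by reasoning-like surface structure will lose accuracy.
    """
    if item.correct == "A":
        return PairwiseItem(item.question, item.answer_a,
                            f"{item.answer_b}\n{text}", "A")
    return PairwiseItem(item.question, f"{item.answer_a}\n{text}",
                        item.answer_b, "B")


def accuracy(judge: Callable[[PairwiseItem], str],
             items: List[PairwiseItem]) -> float:
    """Fraction of items on which the judge picks the correct answer."""
    if not items:
        return 0.0
    return sum(judge(i) == i.correct for i in items) / len(items)


def accuracy_drop(judge: Callable[[PairwiseItem], str],
                  items: List[PairwiseItem], text: str) -> float:
    """Clean accuracy minus accuracy after injecting `text` (hypothetical metric)."""
    perturbed = [inject(i, text) for i in items]
    return accuracy(judge, items) - accuracy(judge, perturbed)
```

Here `judge` stands in for any callable that maps an item to "A" or "B", for example a wrapper around a model call. Passing `SIMPLE_CUE` or `FAKE_COT` as `text` gives one number per cue type, which makes the two easy to compare on the same items.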
Taxonomy
Topics: Explainable Artificial Intelligence (XAI) · Logic, Reasoning, and Knowledge
