When Slower Isn't Truer: Inverse Scaling Law of Truthfulness in Multimodal Reasoning

Sitong Fang; Wenjing Cao; Jiahao Li; Xuyao Wang; Juntao Dai; Chi-Min Chan; Sirui Han; Yike Guo; Yaodong Yang; Jiaming Ji

arXiv:2505.20214·cs.AI·April 17, 2026


TL;DR

This paper examines an inverse scaling law of truthfulness in multimodal reasoning models: slower, more deliberate reasoning can produce more falsehoods when visual inputs are incomplete or misleading.

Contribution

The paper presents the first systematic study of the inverse scaling law in slow-thinking multimodal reasoning, showing that slow-thinking models can produce more falsehoods and are vulnerable to flawed premises.

Findings

1. When given misleading inputs, slower reasoning models tend to fabricate false details.
2. Faster chat models exhibit more cautious, breadth-first inference.
3. Depth-first search (DFS)-style reasoning becomes fragile when the multimodal input is ambiguous.
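
The contrast between depth-first and breadth-first inference in the findings above can be illustrated with a toy traversal. This is only a sketch under stated assumptions, not the paper's method: the tree of interpretations, the node labels, and the functions `dfs_order` and `bfs_order` are hypothetical and exist purely to show how the two search orders spend a fixed budget of steps.

```python
from collections import deque

# Hypothetical toy tree of interpretations for an ambiguous image.
# Each key maps to the more specific claims that would build on it.
TREE = {
    "image": ["it is a cat", "it is a dog", "cannot tell"],
    "it is a cat": ["the cat is asleep"],
    "the cat is asleep": ["the cat is dreaming"],
    "the cat is dreaming": [],
    "it is a dog": ["the dog is barking"],
    "the dog is barking": [],
    "cannot tell": [],
}


def dfs_order(root, budget):
    """Visit up to `budget` nodes depth-first: commit to one branch and go deep."""
    order, stack = [], [root]
    while stack and len(order) < budget:
        node = stack.pop()
        order.append(node)
        # Push children in reverse so the first child is explored first.
        stack.extend(reversed(TREE[node]))
    return order


def bfs_order(root, budget):
    """Visit up to `budget` nodes breadth-first: survey all alternatives first."""
    order, queue = [], deque([root])
    while queue and len(order) < budget:
        node = queue.popleft()
        order.append(node)
        queue.extend(TREE[node])
    return order


if __name__ == "__main__":
    print(dfs_order("image", 4))
    print(bfs_order("image", 4))
```

With a budget of four steps, the depth-first order elaborates a single premise into ever more specific claims and never reaches the "cannot tell" option, while the breadth-first order considers every top-level interpretation, including "cannot tell", before refining any of them. If the first premise is wrong, each further depth-first step adds detail on top of that error.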

Abstract

Reasoning models have attracted increasing attention for their ability to tackle complex tasks, embodying the System II (slow thinking) paradigm in contrast to System I (fast, intuitive responses). Yet a key question remains: Does slower reasoning necessarily lead to more truthful answers? Our findings suggest otherwise. We conduct the first systematic study of the inverse scaling law in slow-thinking paradigms for multimodal reasoning. We find that when confronted with incomplete or misleading visual inputs, slow-thinking models are more prone to fabricating plausible yet false details to justify untruthful reasoning. To analyze this behavior, we construct a 5,000-sample hierarchical prompt dataset annotated by 50 human participants. The prompts progressively increase in complexity, revealing a consistent pattern: slower reasoning models tend to follow depth-first search (DFS)…
