Thinking in Uncertainty: Mitigating Hallucinations in MLRMs with Latent Entropy-Aware Decoding

Zhongxing Xu; Zhonghua Wang; Zhe Qian; Dachuan Shi; Feilong Tang; Ming Hu; Shiyan Su; Xiaocheng Zou; Wei Feng; Dwarikanath Mahapatra; Yifan Peng; Mingquan Lin; Zongyuan Ge

arXiv:2603.13366 · cs.CV · March 17, 2026

Open Access

TL;DR

This paper introduces LEAD, a decoding strategy that reduces hallucinations in multimodal large reasoning models by using entropy-aware mode switching and semantic-rich representations from token probability distributions.

Contribution

The paper proposes a novel entropy-aware decoding method, LEAD, that leverages latent superposed reasoning and semantic context to improve reasoning reliability in MLRMs.

Findings

01. LEAD significantly reduces hallucinations across multiple benchmarks.

02. The approach improves reasoning accuracy without sacrificing efficiency.

03. Semantic context extraction from token probabilities enhances model robustness.

Abstract

Recent advancements in multimodal large reasoning models (MLRMs) have significantly improved performance in visual question answering. However, we observe that transition words (e.g., because, however, and wait) are closely associated with hallucinations and tend to exhibit high-entropy states. We argue that adequate contextual reasoning information can be directly extracted from the token probability distribution. Inspired by superposed representation theory, we propose leveraging latent superposed reasoning to integrate multiple candidate semantics and maintain latent reasoning trajectories. The hypothesis is that reliance on discrete textual inputs may drive the model toward sequential explicit reasoning, underutilizing dense contextual cues during high-entropy reasoning stages. Therefore, we propose constructing rich semantic representations from the token probability distributions…
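
The entropy-aware switching idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the fixed `threshold`, and the toy embedding table are all assumptions made for illustration. The sketch computes the Shannon entropy of the next-token distribution; when entropy is high, it feeds back a probability-weighted mixture of token embeddings (a "superposed" representation), and otherwise it feeds back the embedding of the single most likely token.

```python
import math


def softmax(logits):
    """Convert raw logits to a probability distribution (numerically stable)."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def entropy(probs):
    """Shannon entropy in nats; zero-probability entries contribute nothing."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def next_input_embedding(logits, embedding_table, threshold):
    """Hypothetical helper: choose what to feed back at the next decoding step.

    High entropy -> probability-weighted mix of all token embeddings ("latent").
    Low entropy  -> embedding of the argmax token ("discrete").
    The threshold is an assumed hyperparameter, not a value from the paper.
    """
    probs = softmax(logits)
    h = entropy(probs)
    if h > threshold:
        dim = len(embedding_table[0])
        mixed = [
            sum(p * row[d] for p, row in zip(probs, embedding_table))
            for d in range(dim)
        ]
        return "latent", h, mixed
    best = max(range(len(probs)), key=probs.__getitem__)
    return "discrete", h, list(embedding_table[best])


# Toy vocabulary of three tokens with 2-dimensional embeddings (assumed values).
table = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]

# A peaked distribution has low entropy, so the argmax embedding is used.
peaked_mode, peaked_h, peaked_emb = next_input_embedding([10.0, 0.0, 0.0], table, 0.5)

# A uniform distribution has entropy ln(3), so the embeddings are mixed.
flat_mode, flat_h, flat_emb = next_input_embedding([0.0, 0.0, 0.0], table, 0.5)
```

In this toy setup the uniform case returns the average of the three embeddings, which keeps all candidate semantics in play instead of committing to one token. A real decoder would apply the same choice to the model's input embedding matrix at each step.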

Taxonomy

Topics: Multimodal Machine Learning Applications · Advanced Graph Neural Networks · Topic Modeling