IRIS: Implicit Reward-Guided Internal Sifting for Mitigating Multimodal Hallucination

Yuanshuai Li; Yuping Yan; Jirui Han; Fei Ming; Lingjuan Lv; Yaochu Jin

arXiv:2602.01769 · cs.LG · February 4, 2026

TL;DR

IRIS introduces a novel on-policy method leveraging implicit rewards within the model's native space to effectively reduce hallucinations in multimodal large language models without external evaluators.

Contribution

The paper proposes IRIS, a new implicit reward-guided internal sifting approach that directly addresses modal conflicts and hallucinations in MLLMs using internal signals and self-generated preferences.

Findings

01. IRIS achieves competitive hallucination mitigation performance.

02. Uses only 5.7k samples without external feedback.

03. Effectively captures internal modal conflicts.

Abstract

Hallucination remains a fundamental challenge for Multimodal Large Language Models (MLLMs). While Direct Preference Optimization (DPO) is a key alignment framework, existing approaches often rely heavily on costly external evaluators for scoring or rewriting, incurring off-policy learnability gaps and discretization loss. Due to the lack of access to internal states, such feedback overlooks the fine-grained conflicts between different modalities that lead to hallucinations during generation. To address this issue, we propose IRIS (Implicit Reward-Guided Internal Sifting), which leverages continuous implicit rewards in the native log-probability space to preserve full information density and capture internal modal competition. This on-policy paradigm eliminates learnability gaps by utilizing self-generated preference pairs. By sifting these pairs based on multimodal implicit rewards,…
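As a rough illustration of the idea in the abstract, the sketch below computes the standard DPO implicit reward, r(x, y) = β · (log π_θ(y | x) − log π_ref(y | x)), for several self-generated responses and keeps the highest- and lowest-reward responses as a preference pair. The abstract is truncated before it states the actual sifting rule, so the best-versus-worst selection, the `min_margin` threshold, the `Candidate` type, and all function names here are assumptions made for illustration, not the authors' procedure. The multimodal part of the reward is not modeled: the sequence log-probabilities are taken as given inputs.

```python
import math
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Candidate:
    """One self-generated response with its summed token log-probabilities."""
    text: str
    logp_policy: float  # log pi_theta(y | x) under the current policy
    logp_ref: float     # log pi_ref(y | x) under the frozen reference model


def implicit_reward(c: Candidate, beta: float = 0.1) -> float:
    """Standard DPO implicit reward: beta * (log pi_theta - log pi_ref)."""
    return beta * (c.logp_policy - c.logp_ref)


def sift_pair(
    candidates: List[Candidate],
    beta: float = 0.1,
    min_margin: float = 0.0,  # hypothetical threshold, not from the paper
) -> Optional[Tuple[Candidate, Candidate, float]]:
    """Return (chosen, rejected, margin), or None if the pair is uninformative.

    Assumed rule: highest-reward response is chosen, lowest is rejected,
    and the pair is dropped when the reward gap does not exceed min_margin.
    """
    if len(candidates) < 2:
        return None
    ranked = sorted(candidates, key=lambda c: implicit_reward(c, beta))
    rejected, chosen = ranked[0], ranked[-1]
    margin = implicit_reward(chosen, beta) - implicit_reward(rejected, beta)
    if margin <= min_margin:
        return None
    return chosen, rejected, margin


def dpo_loss(chosen: Candidate, rejected: Candidate, beta: float = 0.1) -> float:
    """DPO loss -log(sigmoid(margin)), computed as a stable softplus(-margin)."""
    margin = implicit_reward(chosen, beta) - implicit_reward(rejected, beta)
    if margin >= 0:
        return math.log1p(math.exp(-margin))
    return -margin + math.log1p(math.exp(margin))


if __name__ == "__main__":
    # Toy log-probabilities; real values would come from scoring sampled
    # responses with the policy and reference models.
    pool = [
        Candidate("a dog on a sofa", logp_policy=-10.0, logp_ref=-12.0),
        Candidate("a dog and a cat on a sofa", logp_policy=-15.0, logp_ref=-11.0),
        Candidate("a pet indoors", logp_policy=-9.0, logp_ref=-9.5),
    ]
    result = sift_pair(pool, beta=0.1, min_margin=0.1)
    if result is not None:
        chosen, rejected, margin = result
        print(chosen.text, "|", rejected.text, "|", round(margin, 3))
        print("loss:", round(dpo_loss(chosen, rejected), 4))
```

Because every candidate is sampled from the policy being trained, the resulting pairs are on-policy, and the reward stays a continuous quantity in log-probability space instead of being discretized into an external score.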

Taxonomy

Topics: Topic Modeling · Multimodal Machine Learning Applications · Mental Health via Writing