OPERA: Alleviating Hallucination in Multi-Modal Large Language Models via Over-Trust Penalty and Retrospection-Allocation
Qidong Huang, Xiaoyi Dong, Pan Zhang, Bin Wang, Conghui He, Jiaqi Wang, Dahua Lin, Weiming Zhang, Nenghai Yu

TL;DR
OPERA is a decoding strategy for multi-modal large language models that reduces hallucinations by penalizing over-trust in certain tokens and reallocating token choices based on retrospection, without extra training or data.
Contribution
The paper introduces a decoding method that combines an over-trust penalty with a retrospection-allocation strategy to mitigate hallucinations in MLLMs without additional data or training.
Findings
Significantly reduces hallucinations across various MLLMs.
Effective without additional training or external knowledge.
Proven to be generalizable and efficient.
Abstract
Hallucination, a pervasive challenge for multi-modal large language models (MLLMs), has significantly impeded their real-world usage in settings that demand precise judgment. Existing methods mitigate this issue either by training with specifically designed data or by performing inference with external knowledge from other sources, both of which incur additional costs. In this paper, we present OPERA, a novel MLLM decoding method grounded in an Over-trust Penalty and a Retrospection-Allocation strategy, serving as a nearly free lunch to alleviate the hallucination issue without additional data, knowledge, or training. Our approach begins with an interesting observation that most hallucinations are closely tied to the knowledge aggregation patterns manifested in the self-attention matrix, i.e., MLLMs tend to generate new tokens by focusing on a few summary tokens rather than all the previous tokens. Such…
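The observation in the abstract, that new tokens concentrate attention on a few summary tokens, can be illustrated with a small sketch. The code below is not the authors' implementation: the function names, the `window`, `scale`, and `alpha` parameters, the column-product score, and the toy attention matrix are all assumptions chosen for illustration. It scores each column of a causal attention matrix by the product of the weights that later tokens place on it, treats the largest score as a hypothetical over-trust penalty, and subtracts that penalty from candidate log-probabilities.

```python
from math import prod


def column_scores(attn, window, scale=1.0):
    """Score each column in the trailing `window` positions of a causal
    attention matrix by the product of the (scaled) weights that rows
    i >= j place on column j. A large score means many recent tokens
    all attend to the same earlier token (a candidate "summary token").
    This scoring rule is an illustrative assumption."""
    n = len(attn)
    start = max(0, n - window)
    scores = {}
    for j in range(start, n):
        # Under causal masking only rows i >= j can attend to column j.
        scores[j] = prod(scale * attn[i][j] for i in range(j, n))
    return scores


def overtrust_penalty(attn, window, scale=1.0):
    """Return (column index, score) of the most over-trusted column."""
    scores = column_scores(attn, window, scale)
    col = max(scores, key=scores.get)
    return col, scores[col]


def penalize(logprobs, penalties, alpha=1.0):
    """Subtract a weighted penalty from each candidate's log-probability.
    `alpha` is a hypothetical trade-off weight."""
    return [lp - alpha * p for lp, p in zip(logprobs, penalties)]


# Toy causal attention (each row sums to 1); token 1 acts as a summary token.
toy_attn = [
    [1.0, 0.0, 0.0, 0.0],
    [0.2, 0.8, 0.0, 0.0],
    [0.1, 0.8, 0.1, 0.0],
    [0.1, 0.7, 0.1, 0.1],
]

summary_col, score = overtrust_penalty(toy_attn, window=4, scale=5.0)
adjusted = penalize([-1.0, -2.0], [3.0, 0.0], alpha=1.0)
```

In this toy setup the `scale` factor keeps long columns from being dominated by short ones, since a product of many weights below 1 would otherwise shrink toward zero. How such a penalty is combined with beam search and with the retrospection step is left to the paper itself.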
Taxonomy
Topics: Big Data and Digital Economy · Machine Learning in Healthcare · Advanced Graph Neural Networks
