COPO: Causal-Oriented Policy Optimization for Hallucinations of MLLMs
Peizheng Guo, Jingyao Wang, Wenwen Qiang, Jiahuan Zhou, Changwen Zheng, Gang Hua

TL;DR
This paper introduces COPO, a causal-oriented policy optimization method that reduces hallucinations in multimodal large language models by focusing on causally relevant tokens, leading to more accurate and grounded outputs.
Contribution
The paper proposes a causal-oriented optimization framework that imposes token-level sufficiency and necessity constraints on each inference token, with the aim of mitigating hallucinations in MLLMs.
Findings
COPO effectively reduces hallucinations in MLLMs.
Experimental results show improved factual accuracy and grounding.
The method outperforms baseline models on multiple benchmarks.
Abstract
Although Multimodal Large Language Models (MLLMs) have shown impressive capabilities, they may suffer from hallucinations. Empirically, we find that MLLMs attend disproportionately to task-irrelevant background regions compared with text-only LLMs, implying spurious background-answer correlations. We claim and analyze that (i) outcome-based rewards can be an important factor leading to spurious correlations, and (ii) spurious correlations can be an important factor leading to hallucinations. Based on these results, we propose Causal-Oriented Policy Optimization (COPO) to mitigate these spurious correlations and thereby address hallucinations. COPO imposes token-level sufficiency and necessity constraints to measure each inference token's causal contribution, thus ensuring correct and evidence-grounded output. Specifically, we first evaluate each token's causal contribution via…
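The abstract is cut off before it says how each token's causal contribution is evaluated, so the following is only a generic illustration of what token-level necessity and sufficiency scores can look like, not the authors' procedure. It assumes a simple ablation view: a token is necessary if removing it lowers the probability of the correct answer, and sufficient if keeping it alone raises that probability above an empty baseline. The names `causal_scores` and `toy_answer_prob`, the evidence set, and the numeric values are all hypothetical.

```python
from typing import Callable, List, Sequence, Tuple

# Hypothetical stand-in for a model: probability of the correct answer
# given the reasoning tokens that are kept. Only "red" and "car" count
# as evidence here; every other token is treated as background.
EVIDENCE = {"red", "car"}


def toy_answer_prob(kept_tokens: Sequence[str]) -> float:
    hits = len(EVIDENCE & set(kept_tokens))
    return 0.1 + 0.4 * hits


def causal_scores(
    tokens: Sequence[str],
    answer_prob: Callable[[Sequence[str]], float],
) -> List[Tuple[str, float, float]]:
    """Return (token, necessity, sufficiency) for each token.

    necessity   = p(answer | all tokens) - p(answer | all tokens except this one)
    sufficiency = p(answer | this token only) - p(answer | no tokens)
    These ablation-based definitions are an assumption made for this sketch.
    """
    full = answer_prob(list(tokens))
    empty = answer_prob([])
    scores = []
    for i, tok in enumerate(tokens):
        without = list(tokens[:i]) + list(tokens[i + 1:])
        necessity = full - answer_prob(without)
        sufficiency = answer_prob([tok]) - empty
        scores.append((tok, necessity, sufficiency))
    return scores


demo_scores = causal_scores(["the", "sky", "red", "car"], toy_answer_prob)
for tok, nec, suf in demo_scores:
    print(f"{tok:>4}  necessity={nec:.2f}  sufficiency={suf:.2f}")
```

In this toy setting, background tokens such as "the" and "sky" receive zero on both scores, while the evidence tokens receive positive scores. A token-level objective could then down-weight the zero-score tokens; how COPO actually uses such scores is not stated in the visible part of the abstract.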
Taxonomy
Topics: Topic Modeling · Mental Health via Writing · Machine Learning in Healthcare
