FINER: MLLMs Hallucinate under Fine-grained Negative Queries
Rui Xiao, Sanghwan Kim, Yongqin Xian, Zeynep Akata, Stephan Alaniz

TL;DR
This paper introduces FINER (FIne-grained NEgative queRies), two accompanying benchmarks, and a DPO-based tuning method, FINER-Tuning, to analyze and reduce hallucinations in multimodal large language models (MLLMs) when they answer fine-grained queries.
Contribution
The paper presents two benchmarks, FINER-CompreCap and FINER-DOCCI, and FINER-Tuning, a fine-tuning approach based on Direct Preference Optimization (DPO) that reduces hallucinations in MLLMs on fine-grained queries.
Findings
FINER-Tuning yields gains of up to 24.2% (InternVL3.5-14B) on hallucinations measured by the FINER benchmarks.
The benchmarks reveal that MLLMs hallucinate when fine-grained mismatches co-occur with elements that are genuinely present in the image.
Finetuning improves performance across multiple existing hallucination benchmarks.
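To make the second finding concrete, here is a minimal sketch of what a multi-object negative query could look like: a question that names several objects truly in the image plus one that is not, so the correct answer is "no". The function name, the question template, and the example objects are all hypothetical illustrations, not the paper's actual templates or data.

```python
def build_negative_query(present_objects, absent_object):
    """Hypothetical template (an assumption, not taken from the paper):
    mention objects that are in the image plus one that is not."""
    mentioned = list(present_objects) + [absent_object]
    if len(mentioned) == 1:
        listing = mentioned[0]
    else:
        listing = ", ".join(mentioned[:-1]) + " and " + mentioned[-1]
    question = f"Can you see {listing} in this image?"
    # One mentioned object is absent, so the expected answer is "no".
    return {"question": question, "answer": "no"}


# Made-up example: the dog and the ball are present, the frisbee is not.
example = build_negative_query(["a dog", "a red ball"], "a frisbee")
print(example["question"])
```

A model that answers "yes" here has been swayed by the present objects into accepting the single mismatched one, which is the failure pattern the finding describes.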
Abstract
Multimodal large language models (MLLMs) struggle with hallucinations, particularly with fine-grained queries, a challenge underrepresented by existing benchmarks that focus on coarse image-related questions. We introduce FIne-grained NEgative queRies (FINER), alongside two benchmarks: FINER-CompreCap and FINER-DOCCI. Using FINER, we analyze hallucinations across four settings: multi-object, multi-attribute, multi-relation, and "what" questions. Our benchmarks reveal that MLLMs hallucinate when fine-grained mismatches co-occur with genuinely present elements in the image. To address this, we propose FINER-Tuning, leveraging Direct Preference Optimization (DPO) on FINER-inspired data. Finetuning four frontier MLLMs with FINER-Tuning yields up to 24.2% gains (InternVL3.5-14B) on hallucinations from our benchmarks, while simultaneously improving performance on eight existing…
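For readers unfamiliar with DPO, the following is a minimal sketch of the standard DPO loss for a single preference pair, written in plain Python. It is not the paper's training code: the function name, the scalar log-probability inputs, and the default beta = 0.1 are assumptions made for illustration. In a FINER-Tuning-like setup, the "chosen" response would presumably be the one that correctly rejects the mismatched detail and the "rejected" response the hallucinated one.

```python
import math


def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """Standard DPO loss for one (chosen, rejected) pair.

    Inputs are summed log-probabilities of each full response under the
    policy being tuned and under a frozen reference model. beta=0.1 is an
    illustrative default, not a value reported in the source.
    """
    chosen_logratio = policy_chosen_logp - ref_chosen_logp
    rejected_logratio = policy_rejected_logp - ref_rejected_logp
    margin = beta * (chosen_logratio - rejected_logratio)
    # Loss is -log(sigmoid(margin)), computed in a numerically stable way.
    if margin >= 0:
        return math.log1p(math.exp(-margin))
    return -margin + math.log1p(math.exp(margin))


# Hypothetical numbers: the policy already favors the chosen response
# more than the reference does, so the loss falls below log(2).
print(dpo_loss(-1.0, -5.0, -2.0, -2.0))
```

When the policy and the reference agree exactly, the margin is zero and the loss equals log(2); the loss shrinks as the policy shifts probability toward the chosen response relative to the reference.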
Taxonomy
Topics: Multimodal Machine Learning Applications · Advanced Graph Neural Networks · Adversarial Robustness in Machine Learning
