Mitigating Object Hallucinations via Sentence-Level Early Intervention
Shangpin Peng, Senqiao Yang, Li Jiang, Zhuotao Tian

TL;DR
This paper introduces SENTINEL, a novel framework that significantly reduces hallucinations in multimodal large language models by focusing on early sentence-level intervention and preference learning without human annotations.
Contribution
SENTINEL is the first method to target early-stage hallucination propagation using in-domain preference learning and context-aware training, improving hallucination mitigation without extra human labels.
Findings
- Reduces hallucinations by over 90% compared to the baseline model
- Outperforms the previous state of the art on hallucination benchmarks
- Improves the generalization and robustness of multimodal models
Abstract
Multimodal large language models (MLLMs) have revolutionized cross-modal understanding but continue to struggle with hallucinations - fabricated content contradicting visual inputs. Existing hallucination mitigation methods either incur prohibitive computational costs or introduce distribution mismatches between training data and model outputs. We identify a critical insight: hallucinations predominantly emerge at the early stages of text generation and propagate through subsequent outputs. To address this, we propose SENTINEL (Sentence-level Early iNtervention Through IN-domain prEference Learning), a framework that eliminates dependency on human annotations. Specifically, we first bootstrap high-quality in-domain preference pairs by iteratively sampling model outputs, validating object existence through cross-checking with two open-vocabulary detectors, and classifying sentences into…
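The detector cross-checking step mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the `Detector` interface, the stub detectors, and the three verdict labels (`present`, `absent`, `uncertain`) are all assumptions made for illustration, and the abstract is cut off before it states how sentences are classified, so the sketch stops at the object level.

```python
from typing import Callable, Dict, Iterable

# Hypothetical detector interface (an assumption, not a real library API):
# takes an image identifier and an object name, returns True if found.
Detector = Callable[[str, str], bool]


def cross_check(
    image: str,
    objects: Iterable[str],
    det_a: Detector,
    det_b: Detector,
) -> Dict[str, str]:
    """Label each mentioned object by whether two detectors agree on it."""
    verdicts: Dict[str, str] = {}
    for name in objects:
        found_a = det_a(image, name)
        found_b = det_b(image, name)
        if found_a and found_b:
            verdicts[name] = "present"    # both detectors find the object
        elif not found_a and not found_b:
            verdicts[name] = "absent"     # both detectors agree it is missing
        else:
            verdicts[name] = "uncertain"  # detectors disagree
    return verdicts


# Stub detectors standing in for real open-vocabulary detectors.
def stub_detector_a(image: str, name: str) -> bool:
    return name in {"dog", "frisbee"}


def stub_detector_b(image: str, name: str) -> bool:
    return name in {"dog", "bench"}


result = cross_check(
    "example.jpg", ["dog", "frisbee", "car"], stub_detector_a, stub_detector_b
)
```

Requiring agreement between two independent detectors is one way to keep a single detector's errors out of automatically built preference data; objects the detectors disagree on can be set aside rather than forced into either label.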
Code & Models
- 🤗 psp-dada/LLaVA-v1.5-7B-SENTINEL · model · 47 dl · ♡ 1
- 🤗 psp-dada/LLaVA-v1.5-13B-SENTINEL · model · 9 dl · ♡ 1
- 🤗 psp-dada/LLaVA-v1.6-Vicuna-13B-SENTINEL · model · ♡ 1
- 🤗 psp-dada/LLaVA-v1.6-Vicuna-7B-SENTINEL · model · ♡ 1
- 🤗 psp-dada/Qwen2.5-VL-7B-Instruct-SENTINEL · model · ♡ 1
- 🤗 psp-dada/Qwen2-VL-7B-Instruct-SENTINEL · model · ♡ 1
- 🤗 psp-dada/Qwen2-VL-2B-Instruct-SENTINEL · model · ♡ 1
