VORD: Visual Ordinal Calibration for Mitigating Object Hallucinations in Large Vision-Language Models

Dexter Neo; Tsuhan Chen

arXiv:2412.15739·cs.CV·December 23, 2024

TL;DR

VORD is a calibration method for large vision-language models that reduces object hallucinations by exploiting ordinal relationships between modified image pairs, improving calibration with little or no training.

Contribution

The paper introduces VORD, a simple calibration approach offered in both a training-free and a trainable form, which mitigates hallucinations in LVLMs by using ordinal relationships between modified image pairs.

Findings

01. VORD improves calibration accuracy in LVLMs.

02. VORD reduces object hallucinations across benchmarks.

03. The method is effective with minimal or no training.

Abstract

Large Vision-Language Models (LVLMs) have made remarkable developments along with the recent surge of large language models. Despite their advancements, LVLMs have a tendency to generate plausible yet inaccurate or inconsistent information based on the provided source content. This phenomenon, also known as "hallucinations", can have serious downstream implications during the deployment of LVLMs. To address this, we present VORD, a simple and effective method that alleviates hallucinations by calibrating token predictions based on ordinal relationships between modified image pairs. VORD is presented in two forms: (1) a minimalist training-free variant which eliminates implausible tokens from modified image pairs, and (2) a trainable objective function that penalizes unlikely tokens. Our experiments demonstrate that VORD delivers better calibration and effectively mitigates object…
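The two forms described in the abstract can be illustrated with a minimal sketch. The abstract does not state the exact rule, so everything below is an assumption made for illustration: the ordinal relationship is taken to mean that a plausible token should be at least as likely given the original image as given the modified one, and the names `ordinal_filter`, `ordinal_penalty`, and `margin` are hypothetical rather than taken from the paper.

```python
import math


def softmax(logits):
    """Convert raw scores into a probability distribution."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def ordinal_filter(logits_clean, logits_modified, margin=0.0):
    """Training-free sketch (assumed rule, not the paper's exact one).

    Drops tokens that are more likely under the modified image than
    under the original one, then renormalizes the surviving tokens.
    """
    p_clean = softmax(logits_clean)
    p_mod = softmax(logits_modified)
    keep = [pc + margin >= pm for pc, pm in zip(p_clean, p_mod)]
    if not any(keep):
        # Nothing satisfies the assumed ordering: fall back to the
        # unfiltered distribution rather than return all zeros.
        return p_clean
    kept = [pc if k else 0.0 for pc, k in zip(p_clean, keep)]
    total = sum(kept)
    return [x / total for x in kept]


def ordinal_penalty(logits_clean, logits_modified, margin=0.0):
    """Trainable sketch: hinge penalty on violations of the assumed ordering."""
    p_clean = softmax(logits_clean)
    p_mod = softmax(logits_modified)
    return sum(max(0.0, pm - pc - margin) for pc, pm in zip(p_clean, p_mod))


# Toy vocabulary of three tokens; the third becomes likelier only
# once the image is modified, so it is treated as implausible.
clean = [3.0, 1.0, 0.0]
modified = [1.0, 1.0, 3.0]
calibrated = ordinal_filter(clean, modified)
penalty = ordinal_penalty(clean, modified)
```

In this toy case the third token is removed from the calibrated distribution and contributes the only nonzero term to the penalty. How the image pairs are modified, and whether a margin is used at all, are left open here because the truncated abstract does not specify them.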

Taxonomy

Topics: Functional Brain Connectivity Studies · Epilepsy research and treatment · Cell Image Analysis Techniques