Mitigating Hallucinations in Multimodal Spatial Relations through Constraint-Aware Prompting

Jiarui Wu; Zhuo Liu; Hangfeng He

arXiv:2502.08317·cs.CL·March 24, 2025

TL;DR

This paper introduces a constraint-aware prompting framework that reduces spatial relation hallucinations in large vision-language models by enforcing bidirectional and transitivity constraints, leading to more coherent spatial predictions.

Contribution

The paper proposes a novel constraint-aware prompting method with bidirectional and transitivity constraints to mitigate spatial hallucinations in LVLMs.

Findings

01 Improved spatial relation accuracy on three datasets

02 Enhanced consistency in object relation predictions

03 Systematic analysis of constraint effectiveness

Abstract

Spatial relation hallucinations pose a persistent challenge in large vision-language models (LVLMs), leading them to generate incorrect predictions about object positions and spatial configurations within an image. To address this issue, we propose a constraint-aware prompting framework designed to reduce spatial relation hallucinations. Specifically, we introduce two types of constraints: (1) a bidirectional constraint, which ensures consistency in pairwise object relations, and (2) a transitivity constraint, which enforces relational dependence across multiple objects. By incorporating these constraints, LVLMs can produce more spatially coherent and consistent outputs. We evaluate our method on three widely used spatial relation datasets, demonstrating performance improvements over existing approaches. Additionally, a systematic analysis of various bidirectional relation analysis choices and…
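
The two constraints named in the abstract can be illustrated as simple logical checks over predicted relations. The sketch below is not the authors' method, which applies the constraints through prompting; it only shows what "bidirectional" and "transitive" consistency could mean for a set of predictions. The relation vocabulary (`left`, `right`, `above`, `below`) and the helper names `bidirectional_ok` and `transitive_ok` are assumptions made for illustration.

```python
# Illustrative sketch only: the relation names and both helpers are
# hypothetical and do not come from the paper.
INVERSE = {"left": "right", "right": "left", "above": "below", "below": "above"}


def bidirectional_ok(rel_ab, rel_ba):
    """Return True if the A->B relation and the B->A relation are inverses."""
    return rel_ab in INVERSE and INVERSE[rel_ab] == rel_ba


def transitive_ok(rel_ab, rel_bc, rel_ac):
    """Check transitivity for a chain A -> B -> C.

    If A->B and B->C are the same relation, A->C must be that relation too.
    If the two premises differ, nothing is implied, so the check passes.
    """
    if rel_ab != rel_bc:
        return True
    return rel_ac == rel_ab


# "A is left of B" is consistent with "B is right of A".
print(bidirectional_ok("left", "right"))
# "A left of B" and "B left of C" contradict "A right of C".
print(transitive_ok("left", "left", "right"))
```

A check like this could flag contradictory answers after the fact; the paper's framework instead aims to steer the model toward consistent answers at prompting time.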

Taxonomy

Topics: Data Visualization and Analytics · Psychiatry, Mental Health, Neuroscience