Understanding and Evaluating Hallucinations in 3D Visual Language Models

Ruiying Peng; Kaiyuan Li; Weichen Zhang; Chen Gao; Xinlei Chen; Yong Li

arXiv:2502.15888 · cs.CV · February 25, 2025

Open Access

TL;DR

This paper systematically studies hallucinations in 3D-LLMs, revealing their causes and proposing new metrics to evaluate and understand these inaccuracies in scene understanding models.

Contribution

The paper presents the first systematic study of hallucinations in 3D-LLMs, identifying their key causes and introducing new evaluation metrics for these models.

Findings

01. All tested 3D-LLMs are significantly affected by hallucinations.

02. Main causes include dataset object frequency imbalance, object correlations, and limited attribute diversity.

03. Proposed new metrics effectively evaluate hallucination severity and model alignment.

Abstract

Recently, 3D-LLMs, which combine point-cloud encoders with large models, have been proposed to tackle complex tasks in embodied intelligence and scene understanding. Although these models show promising results on 3D tasks, we find that they are significantly affected by hallucinations. For instance, they may generate objects that do not exist in the scene or produce incorrect relationships between objects. To investigate this issue, this work presents the first systematic study of hallucinations in 3D-LLMs. We begin with a quick evaluation of several representative 3D-LLMs, which reveals that all of them are significantly affected by hallucinations. We then define hallucinations in 3D scenes and, through a detailed analysis of datasets, uncover their underlying causes. We find three main causes: (1) Uneven frequency distribution of objects in the dataset. (2)…
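
The two ideas the abstract names, objects that are mentioned but absent from the scene and uneven object frequency in the dataset, can be illustrated with a small sketch. This is not the paper's metric, which this excerpt does not describe. The function names `object_frequencies` and `hallucination_rate` and the toy scene labels are hypothetical, and the rate shown is simply the fraction of mentioned object labels that are missing from the ground-truth scene.

```python
from collections import Counter


def object_frequencies(scenes):
    """Count in how many scenes each object label appears (hypothetical helper)."""
    counts = Counter()
    for labels in scenes:
        # Use a set so repeated instances in one scene count once.
        counts.update(set(labels))
    return counts


def hallucination_rate(mentioned, present):
    """Fraction of mentioned object labels absent from the scene.

    Illustrative measure only; not the metric proposed in the paper.
    """
    mentioned = set(mentioned)
    if not mentioned:
        return 0.0
    return len(mentioned - set(present)) / len(mentioned)


# Toy ground-truth object labels per scene (made-up data for illustration).
scenes = [
    ["chair", "table", "chair"],
    ["chair", "sofa"],
    ["chair", "table", "lamp"],
]

freq = object_frequencies(scenes)

# Suppose a model describing the second scene mentions these objects.
rate = hallucination_rate(["chair", "table", "tv"], scenes[1])
```

In this toy data "chair" appears in every scene, which is the kind of frequency imbalance the abstract lists as a cause; a model biased toward frequent or co-occurring labels would tend to raise the rate computed by `hallucination_rate`.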

Taxonomy

Topics: Data Visualization and Analytics