3D-GRAND: A Million-Scale Dataset for 3D-LLMs with Better Grounding and Less Hallucination
Jianing Yang, Xuweiyi Chen, Nikhil Madaan, Madhavan Iyengar, Shengyi Qian, David F. Fouhey, Joyce Chai

TL;DR
The paper introduces 3D-GRAND, a large-scale dataset of densely language-grounded 3D scenes, and 3D-POPE, a benchmark for evaluating hallucination in 3D language models. Instruction tuning on 3D-GRAND improves grounding and reduces hallucination, advancing embodied AI.
Contribution
It provides a pioneering large-scale dataset and a systematic benchmark for evaluating and enhancing 3D-LLMs, addressing key challenges in grounding and hallucination reduction.
Findings
Instruction tuning with 3D-GRAND improves grounding in 3D-LLMs.
Large-scale datasets significantly boost 3D-LLM performance.
Models trained on synthetic data transfer effectively to real-world 3D scans.
Abstract
The integration of language and 3D perception is crucial for embodied agents and robots that comprehend and interact with the physical world. While large language models (LLMs) have demonstrated impressive language understanding and generation capabilities, their adaptation to 3D environments (3D-LLMs) remains in its early stages. A primary challenge is a lack of large-scale datasets with dense grounding between language and 3D scenes. We introduce 3D-GRAND, a pioneering large-scale dataset comprising 40,087 household scenes paired with 6.2 million densely-grounded scene-language instructions. Our results show that instruction tuning with 3D-GRAND significantly enhances grounding capabilities and reduces hallucinations in 3D-LLMs. As part of our contributions, we propose a comprehensive benchmark 3D-POPE to systematically evaluate hallucination in 3D-LLMs, enabling fair comparisons of…
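The abstract does not spell out how 3D-POPE scores a model. As a rough sketch only, assuming the benchmark follows a POPE-style polling protocol (the model answers yes/no questions about whether an object exists in a scene), hallucination could be summarized with standard binary-classification metrics plus the fraction of "yes" answers. The names `PollingScores` and `score_existence_answers` are hypothetical and are not taken from the authors' code.

```python
from dataclasses import dataclass


@dataclass
class PollingScores:
    accuracy: float
    precision: float
    recall: float
    f1: float
    yes_rate: float  # fraction of questions answered "yes"


def score_existence_answers(predictions, labels):
    """Score yes/no object-existence answers (hypothetical helper).

    predictions, labels: equal-length lists of bools, where True means
    "yes, the object is present in the scene".
    """
    if len(predictions) != len(labels) or not labels:
        raise ValueError("predictions and labels must be non-empty and equal length")

    pairs = list(zip(predictions, labels))
    tp = sum(1 for p, l in pairs if p and l)          # correct "yes"
    fp = sum(1 for p, l in pairs if p and not l)      # hallucinated object
    fn = sum(1 for p, l in pairs if not p and l)      # missed object
    tn = sum(1 for p, l in pairs if not p and not l)  # correct "no"

    total = len(pairs)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)

    return PollingScores(
        accuracy=(tp + tn) / total,
        precision=precision,
        recall=recall,
        f1=f1,
        yes_rate=(tp + fp) / total,
    )


if __name__ == "__main__":
    # Toy answers for four questions; not data from the paper.
    preds = [True, True, False, True]
    gold = [True, False, False, True]
    print(score_existence_answers(preds, gold))
```

Under this assumption, a false positive corresponds to the model affirming an object that is not in the scene, so low precision together with a high yes-rate would point to hallucination.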
Taxonomy
Topics: Medical Imaging Techniques and Applications · Advanced X-ray and CT Imaging · Advanced Surface Polishing Techniques
