FreeQ-Graph: Free-form Querying with Semantic Consistent Scene Graph for 3D Scene Understanding
Chenlu Zhan, Yufei Zhang, Gaoang Wang, Hongwei Wang

TL;DR
FreeQ-Graph introduces a novel approach for free-form semantic querying in 3D scenes by constructing a complete scene graph and aligning it with accurate semantic labels without relying on predefined vocabularies, enabling advanced reasoning and querying.
Contribution
The paper presents a method that generates a complete 3D scene graph guided by an LLM and an LVLM, aligns semantic labels without training data, and enables free-form querying with scene-level reasoning.
Findings
Outperforms existing methods in 3D semantic grounding and segmentation.
Excels in complex free-form semantic queries and relational reasoning.
Validated on 6 datasets demonstrating robustness and accuracy.
Abstract
Semantic querying in complex 3D scenes through free-form language presents a significant challenge. Existing 3D scene understanding methods use large-scale training data and CLIP to align text queries with 3D semantic features. However, their reliance on predefined vocabulary priors from training data hinders free-form semantic querying. In addition, recent advanced methods rely on LLMs for scene understanding but lack comprehensive 3D scene-level information and often overlook potential inconsistencies in LLM-generated outputs. In our paper, we propose FreeQ-Graph, which enables Free-form Querying with a semantic consistent scene Graph for 3D scene understanding. The core idea is to encode free-form queries from a complete and accurate 3D scene graph without predefined vocabularies, and to align them with 3D consistent semantic labels, which is accomplished through three key steps. We…
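As a rough illustration of the CLIP-style alignment that the abstract attributes to existing methods, the sketch below matches a text-query embedding against per-object 3D features by cosine similarity. It is not FreeQ-Graph's pipeline: the function names, object ids, and the three-dimensional toy vectors are all hypothetical, and a real system would obtain the embeddings from a trained text/vision encoder.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def ground_query(query_vec, object_feats):
    """Return (object_id, score) for the object feature closest to the query."""
    scored = ((oid, cosine(query_vec, feat)) for oid, feat in object_feats.items())
    return max(scored, key=lambda pair: pair[1])

# Hypothetical per-object features (stand-ins for CLIP-aligned 3D features).
object_feats = {
    "obj_0": [0.9, 0.1, 0.0],
    "obj_1": [0.1, 0.8, 0.1],
    "obj_2": [0.0, 0.2, 0.9],
}

# Hypothetical embedding of a text query.
query_vec = [0.2, 0.9, 0.0]

best_id, best_score = ground_query(query_vec, object_feats)
print(best_id, round(best_score, 3))
```

Matching of this kind can only retrieve what the feature space already encodes, which is one way to read the abstract's concern about predefined vocabulary priors limiting free-form queries.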
Taxonomy
Topics: 3D Shape Modeling and Analysis · Computer Graphics and Visualization Techniques · Image Processing and 3D Reconstruction
Methods: ALIGN · Contrastive Language-Image Pre-training
