Point2Graph: An End-to-end Point Cloud-based 3D Open-Vocabulary Scene Graph for Robot Navigation
Yifan Xu, Ziming Luo, Qianwei Wang, Vineet Kamat, and Carol Menassa

TL;DR
Point2Graph introduces an end-to-end 3D scene graph generation method from point clouds, eliminating the need for RGB-D images or camera poses, and achieves state-of-the-art results in open-vocabulary scene understanding.
Contribution
The paper presents a novel hierarchical framework that combines geometry-based and learning-based methods for open-vocabulary scene graph generation directly from point clouds.
Findings
Outperforms current state-of-the-art (SOTA) methods in open-vocabulary room and object segmentation.
Eliminates dependence on RGB-D images and camera pose data.
Provides an end-to-end pipeline for 3D scene understanding from point clouds.
Abstract
Current open-vocabulary scene graph generation algorithms rely heavily on both 3D scene point cloud data and posed RGB-D images, and thus have limited applicability in scenarios where RGB-D images or camera poses are not readily available. To solve this problem, we propose Point2Graph, a novel end-to-end point cloud-based 3D open-vocabulary scene graph generation framework that eliminates the requirement for posed RGB-D image series. This hierarchical framework contains room and object detection/segmentation and open-vocabulary classification. For the room layer, we merge a geometry-based border detection algorithm with learning-based region detection to segment rooms, and create a "Snap-Lookup" framework for open-vocabulary room classification. In addition, we create an end-to-end pipeline for the object layer to detect and classify 3D objects based…
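The room and object layers described in the abstract suggest a two-level hierarchy. The Python sketch below shows one possible way to hold such a graph in memory; it is not the authors' implementation. The class names (`SceneGraph`, `RoomNode`, `ObjectNode`), the fields, and the exact-match `find_objects` query are all hypothetical, and the labels are assumed to have been produced already by some upstream segmentation and classification step.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class ObjectNode:
    """A detected 3D object; `label` is a free-form (open-vocabulary) string."""
    object_id: int
    label: str
    centroid: tuple[float, float, float]


@dataclass
class RoomNode:
    """A segmented room that owns the objects located inside it."""
    room_id: int
    label: str
    objects: list[ObjectNode] = field(default_factory=list)


@dataclass
class SceneGraph:
    """Hypothetical two-layer hierarchy: scene -> rooms -> objects."""
    rooms: dict[int, RoomNode] = field(default_factory=dict)

    def add_room(self, room_id: int, label: str) -> RoomNode:
        room = RoomNode(room_id, label)
        self.rooms[room_id] = room
        return room

    def add_object(self, room_id: int, obj: ObjectNode) -> None:
        # Attach the object to the room it was assigned to upstream.
        self.rooms[room_id].objects.append(obj)

    def find_objects(self, label: str) -> list[tuple[str, ObjectNode]]:
        """Return (room label, object) pairs whose object label matches exactly."""
        return [
            (room.label, obj)
            for room in self.rooms.values()
            for obj in room.objects
            if obj.label == label
        ]


# Toy scene with made-up labels and coordinates (illustration only).
graph = SceneGraph()
graph.add_room(0, "kitchen")
graph.add_room(1, "office")
graph.add_object(0, ObjectNode(0, "chair", (1.0, 2.0, 0.5)))
graph.add_object(1, ObjectNode(1, "chair", (4.0, 1.0, 0.5)))
graph.add_object(1, ObjectNode(2, "monitor", (4.2, 1.1, 1.0)))

matches = graph.find_objects("chair")
```

A navigation query such as "go to the chair in the office" could then be resolved by filtering `matches` on the room label. A real open-vocabulary system would likely match labels by embedding similarity rather than exact string equality.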
Taxonomy
Topics: Robotics and Sensor-Based Localization · Multimodal Machine Learning Applications · Robotic Path Planning Algorithms
