Open-Fusion: Real-time Open-Vocabulary 3D Mapping and Queryable Scene Representation
Kashu Yamazaki, Taisei Hanyu, Khoa Vo, Thang Pham, Minh Tran, Gianfranco Doretto, Anh Nguyen, Ngan Le

TL;DR
Open-Fusion introduces a real-time 3D mapping method that combines vision-language models and TSDF for open-vocabulary scene understanding without extra 3D training.
Contribution
It presents a novel integration of VLFM and TSDF for open-vocabulary, real-time 3D scene mapping and segmentation without additional 3D training.
Findings
Outperforms existing zero-shot methods on ScanNet
Enables annotation-free, open-vocabulary 3D segmentation
Provides real-time 3D scene comprehension combining VLFM and TSDF
Abstract
Precise 3D environmental mapping is pivotal in robotics. Existing methods often rely on predefined concepts during training or are time-intensive when generating semantic maps. This paper presents Open-Fusion, a groundbreaking approach for real-time open-vocabulary 3D mapping and queryable scene representation using RGB-D data. Open-Fusion harnesses the power of a pre-trained vision-language foundation model (VLFM) for open-set semantic comprehension and employs the Truncated Signed Distance Function (TSDF) for swift 3D scene reconstruction. By leveraging the VLFM, we extract region-based embeddings and their associated confidence maps. These are then integrated with 3D knowledge from TSDF using an enhanced Hungarian-based feature-matching mechanism. Notably, Open-Fusion delivers outstanding annotation-free 3D segmentation for open-vocabulary without necessitating additional 3D…
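The abstract names TSDF as the reconstruction backbone. As a rough illustration of what TSDF fusion does, here is a minimal sketch of the classic weighted running-average voxel update. It is not the Open-Fusion implementation: the function names `truncate` and `integrate`, the flat voxel list, the 5 cm truncation distance, and the uniform observation weight are all assumptions made for this example.

```python
def truncate(sdf, trunc):
    """Scale a signed distance by the truncation band and clamp to [-1, 1]."""
    return max(-1.0, min(1.0, sdf / trunc))


def integrate(tsdf, weight, sdf_obs, trunc=0.05, obs_weight=1.0):
    """Fuse one frame of per-voxel signed distances into the running TSDF.

    tsdf, weight: current per-voxel TSDF values and accumulated weights.
    sdf_obs: per-voxel signed distance for this frame (measured depth minus
             voxel depth along the camera ray), or None if not observed.
    Returns new (tsdf, weight) lists; inputs are left unmodified.
    """
    new_tsdf, new_weight = [], []
    for d_old, w_old, sdf in zip(tsdf, weight, sdf_obs):
        # Skip voxels that were not observed or lie far behind the surface.
        if sdf is None or sdf < -trunc:
            new_tsdf.append(d_old)
            new_weight.append(w_old)
            continue
        d_new = truncate(sdf, trunc)
        w_new = w_old + obs_weight
        # Weighted running average of the old value and the new observation.
        new_tsdf.append((w_old * d_old + obs_weight * d_new) / w_new)
        new_weight.append(w_new)
    return new_tsdf, new_weight


if __name__ == "__main__":
    # Three hypothetical voxels along one ray at depths 0.98, 1.00, 1.02 m.
    tsdf, weight = [0.0, 0.0, 0.0], [0.0, 0.0, 0.0]
    # Frame 1 measures the surface at 1.00 m, frame 2 at 1.02 m.
    tsdf, weight = integrate(tsdf, weight, [0.02, 0.00, -0.02])
    tsdf, weight = integrate(tsdf, weight, [0.04, 0.02, 0.00])
    print(tsdf, weight)
```

Because each voxel update is a constant-time average, fusion of this kind can keep pace with incoming RGB-D frames; a real system would store voxels in a sparse or hashed grid rather than a flat list.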
Taxonomy
Topics: Multimodal Machine Learning Applications · Robotics and Sensor-Based Localization · Advanced Image and Video Retrieval Techniques
