TL;DR
FUS3DMaps introduces a scalable dual-layer 3D semantic mapping method that fuses dense and instance-level layers for improved open-vocabulary mapping in large-scale scenes.
Contribution
It proposes a novel online dual-layer semantic mapping approach that jointly maintains and fuses dense and instance-level semantic layers within a shared voxel map.
Findings
Improves the quality of both dense and instance-level semantic layers.
Enables scalable, accurate open-vocabulary mapping in large-scale environments.
Abstract
Open-vocabulary semantic mapping enables robots to spatially ground previously unseen concepts without requiring predefined class sets. Current training-free methods commonly rely on multi-view fusion of semantic embeddings into a 3D map, either at the instance level, by segmenting views and encoding image crops of the segments, or by projecting image patch embeddings directly into a dense semantic map. The latter approach sidesteps segmentation and 2D-to-3D instance association by operating on full uncropped image frames, but existing methods remain limited in scalability. We present FUS3DMaps, an online dual-layer semantic mapping method that jointly maintains both dense and instance-level open-vocabulary layers within a shared voxel map. This design enables further voxel-level semantic fusion of the layer embeddings, combining the complementary strengths of both semantic mapping…
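The idea of keeping a dense layer and an instance-level layer in one voxel map, then fusing their embeddings per voxel, can be illustrated with a small sketch. This is not the FUS3DMaps implementation: the class `DualLayerVoxelMap`, its method names, the weighted-average update, and the `alpha` blending weight are all hypothetical choices made for illustration, since the abstract does not specify the fusion rule.

```python
import numpy as np


def l2_normalize(x, eps=1e-12):
    """Scale a vector to unit length (guarding against zero norm)."""
    x = np.asarray(x, dtype=float)
    return x / max(np.linalg.norm(x), eps)


class DualLayerVoxelMap:
    """Hypothetical shared voxel map holding a dense and an instance layer."""

    def __init__(self, dim, voxel_size=0.05):
        self.dim = dim
        self.voxel_size = voxel_size
        self.dense = {}         # voxel key -> (weighted embedding sum, total weight)
        self.instance_of = {}   # voxel key -> instance id
        self.instance_emb = {}  # instance id -> (weighted embedding sum, total weight)

    def key(self, point):
        """Map a 3D point to the integer index of its voxel."""
        idx = np.floor(np.asarray(point, dtype=float) / self.voxel_size)
        return tuple(int(i) for i in idx)

    def add_dense(self, point, emb, weight=1.0):
        """Multi-view fusion for the dense layer: running weighted sum per voxel."""
        k = self.key(point)
        s, w = self.dense.get(k, (np.zeros(self.dim), 0.0))
        self.dense[k] = (s + weight * l2_normalize(emb), w + weight)

    def add_instance(self, points, instance_id, emb, weight=1.0):
        """Multi-view fusion for the instance layer: one embedding per instance."""
        s, w = self.instance_emb.get(instance_id, (np.zeros(self.dim), 0.0))
        self.instance_emb[instance_id] = (s + weight * l2_normalize(emb), w + weight)
        for p in points:
            self.instance_of[self.key(p)] = instance_id

    def fused(self, point, alpha=0.5):
        """Blend both layers at one voxel (assumed rule: convex combination)."""
        k = self.key(point)
        parts = []
        if k in self.dense:
            s, w = self.dense[k]
            parts.append((1.0 - alpha) * l2_normalize(s / w))
        iid = self.instance_of.get(k)
        if iid is not None:
            s, w = self.instance_emb[iid]
            parts.append(alpha * l2_normalize(s / w))
        if not parts:
            return None  # voxel not observed by either layer
        return l2_normalize(np.sum(parts, axis=0))
```

Because both layers index the same voxel keys, a fused embedding can be looked up for any observed voxel and compared against a text embedding by cosine similarity; a voxel covered by only one layer simply falls back to that layer's embedding.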
