QueSTMaps: Queryable Semantic Topological Maps for 3D Scene Understanding
Yash Mehan, Kumaraditya Gupta, Rohit Jayanti, Anirudh Govil, Sourav Garg, Madhava Krishna

TL;DR
QueSTMaps introduces a novel pipeline for 3D scene understanding that combines topological mapping with semantic labeling, enabling natural language queries and outperforming existing methods in room segmentation and classification.
Contribution
The paper presents a new two-step approach that constructs topological maps and generates CLIP-aligned semantic features for improved 3D scene understanding.
Findings
Outperforms the current state of the art in room segmentation by ~20%
Achieves ~12% improvement in room classification
Supports natural language queries for scene navigation
Abstract
Robotic tasks such as planning and navigation require a hierarchical semantic understanding of a scene, which could include multiple floors and rooms. Current methods primarily focus on object segmentation for 3D scene understanding. However, such methods struggle to segment out topological regions like "kitchen" in the scene. In this work, we introduce a two-step pipeline to solve this problem. First, we extract a topological map, i.e., floorplan of the indoor scene using a novel multi-channel occupancy representation. Then, we generate CLIP-aligned features and semantic labels for every room instance based on the objects it contains using a self-attention transformer. Our language-topology alignment supports natural language querying, e.g., a "place to cook" locates the "kitchen". We outperform the current state-of-the-art on room segmentation by ~20% and room classification by ~12%.…
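The natural language querying described in the abstract can be illustrated with a minimal sketch, assuming each room instance already has a feature vector in the same embedding space as the text query. The function names (`cosine`, `rank_rooms`), the room identifiers, and the 3-dimensional toy vectors are all hypothetical and stand in for real CLIP-aligned features; this is not the authors' implementation, only the generic similarity-ranking idea.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_rooms(query_vec, room_feats):
    """Return (room_id, score) pairs sorted from most to least similar to the query."""
    scores = {room_id: cosine(query_vec, feat) for room_id, feat in room_feats.items()}
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


# Toy vectors (made up for illustration); real features would come from the
# room-level encoder and a CLIP text encoder, which are not reproduced here.
room_feats = {
    "room_0": [0.9, 0.1, 0.0],
    "room_1": [0.1, 0.8, 0.3],
    "room_2": [0.0, 0.2, 0.9],
}
query_vec = [0.8, 0.2, 0.1]  # hypothetical embedding of a query such as "place to cook"

ranking = rank_rooms(query_vec, room_feats)
best_room = ranking[0][0]
print(best_room)
```

In this sketch the query is matched to whichever room feature has the highest cosine similarity; a real system would likely also apply a score threshold so that queries with no good match return nothing.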
Taxonomy
Topics: Advanced Image and Video Retrieval Techniques · Computer Graphics and Visualization Techniques · Robotics and Sensor-Based Localization
