DISC: Dense Integrated Semantic Context for Large-Scale Open-Set Semantic Mapping

Felix Igelbrink; Lennart Niecksch; Martin Atzmueller; Joachim Hertzberg

arXiv:2603.03935·cs.CV·March 5, 2026

Open Access

TL;DR

DISC is an open-set semantic mapping method that extracts mask-aligned CLIP embeddings in a single pass from a vision transformer's intermediate layers, rather than from image crops, and refines instances at the voxel level on the GPU for large-scale, real-time robotic perception.

Contribution

The paper presents a single-pass, GPU-accelerated semantic mapping approach that derives high-quality CLIP embeddings from intermediate transformer layers, eliminating crop-based extraction limitations.
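The idea of pooling a mask-aligned embedding from a dense feature map can be illustrated with a small sketch. This is not the authors' implementation: the summary does not spell out the weighting, so the sketch assumes that each patch inside an instance mask is weighted by its city-block distance to the mask boundary, so that patches deep inside the object count more than patches near its edge. The function names `interior_distance` and `pooled_embedding`, the grid layout, and the toy feature vectors are all hypothetical.

```python
from collections import deque
import math

NEIGHBOURS = ((1, 0), (-1, 0), (0, 1), (0, -1))


def interior_distance(mask):
    """City-block distance from each mask cell to the nearest cell outside the mask.

    Cells outside the mask get 0; the grid border counts as outside.
    (Assumed weighting scheme, chosen for illustration only.)
    """
    h, w = len(mask), len(mask[0])
    dist = [[0] * w for _ in range(h)]
    queue = deque()
    # Seed the search with mask cells that touch the background or the border.
    for y in range(h):
        for x in range(w):
            if not mask[y][x]:
                continue
            for dy, dx in NEIGHBOURS:
                ny, nx = y + dy, x + dx
                if not (0 <= ny < h and 0 <= nx < w) or not mask[ny][nx]:
                    dist[y][x] = 1
                    queue.append((y, x))
                    break
    # Multi-source breadth-first search towards the interior of the mask.
    while queue:
        y, x = queue.popleft()
        for dy, dx in NEIGHBOURS:
            ny, nx = y + dy, x + dx
            if 0 <= ny < h and 0 <= nx < w and mask[ny][nx] and dist[ny][nx] == 0:
                dist[ny][nx] = dist[y][x] + 1
                queue.append((ny, nx))
    return dist


def pooled_embedding(features, mask):
    """Distance-weighted mean of per-patch features, L2-normalised.

    `features[y][x]` is the feature vector of patch (y, x), read once from a
    dense feature map, so no image crop is re-encoded per instance.
    """
    dist = interior_distance(mask)
    dim = len(features[0][0])
    acc = [0.0] * dim
    total = 0.0
    for y, row in enumerate(dist):
        for x, weight in enumerate(row):
            if weight == 0:
                continue
            total += weight
            for k in range(dim):
                acc[k] += weight * features[y][x][k]
    if total == 0:
        raise ValueError("mask is empty")
    mean = [v / total for v in acc]
    norm = math.sqrt(sum(v * v for v in mean))
    return [v / norm for v in mean] if norm > 0 else mean
```

Because the feature map is computed once per image, every instance mask in that image can reuse it, which is the practical appeal of a single-pass scheme over encoding one crop per instance.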

Findings

01. Outperforms state-of-the-art zero-shot methods in accuracy and retrieval.

02. Demonstrates scalability across large, complex indoor scenes.

03. Enables real-time, dense semantic mapping for robotic applications.

Abstract

Open-set semantic mapping enables language-driven robotic perception, but current instance-centric approaches are bottlenecked by context-depriving and computationally expensive crop-based feature extraction. To overcome this fundamental limitation, we introduce DISC (Dense Integrated Semantic Context), featuring a novel single-pass, distance-weighted extraction mechanism. By deriving high-fidelity CLIP embeddings directly from the vision transformer's intermediate layers, our approach eliminates the latency and domain-shift artifacts of traditional image cropping, yielding pure, mask-aligned semantic representations. To fully leverage these features in large-scale continuous mapping, DISC is built upon a fully GPU-accelerated architecture that replaces periodic offline processing with precise, on-the-fly voxel-level instance refinement. We evaluate our approach on standard benchmarks…
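To make "language-driven" perception concrete, the following minimal sketch ranks mapped instances against a text query by cosine similarity between embeddings. It is an illustration under stated assumptions, not the paper's retrieval pipeline: the function `rank_instances`, the instance names, and the two-dimensional toy vectors are hypothetical, and in practice both the query and the instance embeddings would come from a CLIP model.

```python
import math


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_instances(query_embedding, instance_embeddings, top_k=3):
    """Return the top_k (name, score) pairs, best match first.

    `instance_embeddings` maps a hypothetical instance name to its embedding.
    """
    scored = [
        (name, cosine(query_embedding, embedding))
        for name, embedding in instance_embeddings.items()
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


# Toy stand-ins for CLIP embeddings (assumed values, for illustration only).
instances = {"chair": [1.0, 0.0], "lamp": [0.0, 1.0], "sofa": [1.0, 1.0]}
query = [1.0, 0.1]
best = rank_instances(query, instances, top_k=2)
```

Since the query is free-form text rather than a label from a fixed class list, the same ranking step works for categories that were never enumerated in advance, which is what "open-set" refers to here.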

Taxonomy

Topics: Advanced Neural Network Applications · Multimodal Machine Learning Applications · Robotics and Sensor-Based Localization