Rethinking Open-Vocabulary Segmentation of Radiance Fields in 3D Space
Hyunjee Lee, Youngsik Yun, Jeongmin Bae, Seoha Kim, Youngjung Uh

TL;DR
This paper introduces a novel approach for 3D semantic segmentation of radiance fields, enabling complete 3D understanding and real-time rendering, surpassing previous 2D mask-based methods.
Contribution
It redefines 3D segmentation in radiance fields, directly supervises 3D points, and achieves real-time rendering with improved semantic understanding.
Findings
First real-time 3D semantic rendering of radiance fields.
Supervised 3D points improve segmentation accuracy.
New protocol for joint geometry and semantics evaluation.
Abstract
Understanding the 3D semantics of a scene is a fundamental problem for various scenarios such as embodied agents. While NeRFs and 3DGS excel at novel-view synthesis, previous methods for understanding their semantics have been limited to incomplete 3D understanding: their segmentation results are rendered as 2D masks that do not represent the entire 3D space. To address this limitation, we redefine the problem to segment the 3D volume and propose the following methods for better 3D understanding. We directly supervise the 3D points to train the language embedding field, unlike previous methods that anchor supervision at 2D pixels. We transfer the learned language field to 3DGS, achieving the first real-time rendering speed without sacrificing training time or accuracy. Lastly, we introduce a 3D querying and evaluation protocol for assessing the reconstructed geometry and semantics…
Peer Reviews
No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.
Videos
Taxonomy
Topics3D Surveying and Cultural Heritage
MethodsSparse Evolutionary Training · SPEED: Separable Pyramidal Pooling EncodEr-Decoder for Real-Time Monocular Depth Estimation on Low-Resource Settings
