Decomposing NeRF for Editing via Feature Field Distillation
Sosuke Kobayashi, Eiichi Matsumoto, Vincent Sitzmann

TL;DR
This paper introduces a method to decompose NeRFs into semantic regions using 3D feature fields distilled from 2D vision models, enabling targeted editing of selected regions without retraining.
Contribution
It proposes a novel approach to semantically decompose NeRFs via feature field distillation, facilitating query-based local editing of 3D scenes.
Findings
Distilled feature fields enable effective 3D scene segmentation.
The method allows semantic editing based on text, images, or clicks.
It transfers the knowledge of off-the-shelf 2D feature extractors such as CLIP-LSeg and DINO to 3D scene editing.
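As a rough illustration of query-based selection, the sketch below compares a query feature against per-point features and keeps the points that match. It is a minimal sketch under stated assumptions, not the authors' implementation: the function name `select_by_query`, the use of cosine similarity, and the fixed `threshold` are all hypothetical choices made for this example. In practice the query feature would come from encoding text, an image patch, or a clicked location with the same 2D model that supervised the feature field.

```python
import numpy as np

def select_by_query(point_features, query_feature, threshold=0.5):
    """Return (mask, similarity) for 3D sample points against one query.

    point_features: (N, D) array of features sampled from a 3D feature field.
    query_feature:  (D,) array, e.g. an encoded text prompt or image patch.
    threshold:      hypothetical cut-off on cosine similarity (assumption).
    """
    # Normalise both sides so the dot product is a cosine similarity.
    norms = np.linalg.norm(point_features, axis=-1, keepdims=True)
    f = point_features / (norms + 1e-8)
    q = query_feature / (np.linalg.norm(query_feature) + 1e-8)
    similarity = f @ q
    return similarity > threshold, similarity

# Toy data: three points with 2-D features, and a query aligned with the first axis.
features = np.array([[1.0, 0.0], [0.0, 1.0], [0.9, 0.1]])
query = np.array([1.0, 0.0])
mask, similarity = select_by_query(features, query)
```

The resulting mask marks the region an edit (for example recolouring or removal) would be restricted to, leaving the rest of the scene untouched.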
Abstract
Emerging neural radiance fields (NeRF) are a promising scene representation for computer graphics, enabling high-quality 3D reconstruction and novel view synthesis from image observations. However, editing a scene represented by a NeRF is challenging, as the underlying connectionist representations such as MLPs or voxel grids are not object-centric or compositional. In particular, it has been difficult to selectively edit specific regions or objects. In this work, we tackle the problem of semantic scene decomposition of NeRFs to enable query-based local editing of the represented 3D scenes. We propose to distill the knowledge of off-the-shelf, self-supervised 2D image feature extractors such as CLIP-LSeg or DINO into a 3D feature field optimized in parallel to the radiance field. Given a user-specified query of various modalities such as text, an image patch, or a point-and-click…
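The distillation step described in the abstract can be sketched for a single ray as follows. This is a hedged illustration, not the paper's code: it assumes that per-sample features are composited with the same volume-rendering weights NeRF uses for colour, and that the rendered feature is matched to the 2D teacher feature with a squared-error loss. The helper names (`render_weights`, `render_feature`, `distillation_loss`) and the toy values are hypothetical.

```python
import numpy as np

def render_weights(sigma, deltas):
    """Standard NeRF compositing weights for the samples along one ray."""
    alpha = 1.0 - np.exp(-sigma * deltas)  # opacity of each sample
    # Transmittance: probability the ray reaches each sample unoccluded.
    trans = np.cumprod(np.concatenate([[1.0], 1.0 - alpha[:-1]]))
    return trans * alpha

def render_feature(sigma, deltas, features):
    """Composite per-sample features (S, D) into one per-pixel feature (D,)."""
    return render_weights(sigma, deltas) @ features

def distillation_loss(rendered, teacher):
    """Squared error between rendered and teacher features (assumed loss)."""
    return float(np.sum((rendered - teacher) ** 2))

# Toy ray: empty space, then a nearly opaque surface, then an occluded sample.
sigma = np.array([0.0, 1e3, 5.0])
deltas = np.array([0.1, 0.1, 0.1])
features = np.array([[0.2, 0.8], [1.0, 0.0], [0.0, 1.0]])

weights = render_weights(sigma, deltas)
rendered = render_feature(sigma, deltas, features)
loss = distillation_loss(rendered, teacher=features[1])
```

In this toy case almost all the weight falls on the opaque middle sample, so the rendered feature is close to that sample's feature and the loss against a matching teacher feature is near zero. During training, a loss of this kind would be minimised alongside the usual photometric loss.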
Taxonomy
Topics: Advanced Vision and Imaging · Advanced Neural Network Applications · 3D Shape Modeling and Analysis
Methods: Multi-Head Attention · Attention Is All You Need · Softmax · Layer Normalization · Linear Layer · Dense Connections · Residual Connection · Vision Transformer
