NaviNeRF: NeRF-based 3D Representation Disentanglement by Latent Semantic Navigation
Baao Xie, Bohan Li, Zequn Zhang, Junting Dong, Xin Jin, Jingyu Yang, Wenjun Zeng

TL;DR
NaviNeRF introduces a novel self-supervised method for fine-grained 3D disentanglement using NeRF, enabling interpretable semantic manipulation without priors or supervision.
Contribution
To the authors' knowledge, it is the first approach to achieve fine-grained 3D disentanglement with NeRF, using a dual-branch navigation mechanism and requiring no priors or supervision.
Findings
Outperforms previous 3D-aware models in disentanglement quality.
Achieves results comparable to prior models that rely on semantic or geometric priors.
Demonstrates effective global and fine-grained semantic control in 3D representations.
Abstract
3D representation disentanglement aims to identify, decompose, and manipulate the underlying explanatory factors of 3D data, which helps AI fundamentally understand our 3D world. This task is currently under-explored and poses great challenges: (i) 3D representations are complex and in general contain much more information than 2D images; (ii) many 3D representations are not well suited for gradient-based optimization, let alone disentanglement. To address these challenges, we use NeRF as a differentiable 3D representation and introduce a self-supervised Navigation to identify interpretable semantic directions in the latent space. To the best of our knowledge, this novel method, dubbed NaviNeRF, is the first work to achieve fine-grained 3D disentanglement without any priors or supervision. Specifically, NaviNeRF is built upon the generative NeRF pipeline and equipped with an Outer…
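As a rough illustration of what "navigating" a latent space along a semantic direction means, the sketch below shifts a latent code by a scaled unit direction. It is not the authors' implementation: the function names `unit_direction` and `shift_latent`, the latent size, and the step size are hypothetical, and the direction here is random, whereas the abstract describes directions identified by self-supervised navigation.

```python
import math
import random


def unit_direction(dim, seed=0):
    """Return a random unit vector (a stand-in for a learned semantic direction)."""
    rng = random.Random(seed)
    v = [rng.gauss(0.0, 1.0) for _ in range(dim)]
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def shift_latent(z, direction, alpha):
    """Move latent code z by step size alpha along the given direction."""
    return [zi + alpha * di for zi, di in zip(z, direction)]


# Hypothetical usage: an 8-dimensional latent code shifted by alpha = 2.0.
z = [0.0] * 8
d = unit_direction(8)
z_shifted = shift_latent(z, d, alpha=2.0)
```

In a generative pipeline, the shifted code would be fed to the generator in place of the original one, and a direction counts as disentangled if the rendered output changes in only one interpretable attribute.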
Taxonomy
Topics: 3D Shape Modeling and Analysis · Advanced Vision and Imaging · Human Pose and Action Recognition
