Direct Voxel Grid Optimization: Super-fast Convergence for Radiance Fields Reconstruction
Cheng Sun, Min Sun, Hwann-Tzong Chen

TL;DR
This paper introduces a rapid convergence method for scene radiance field reconstruction using voxel grids, achieving NeRF-like quality in under 15 minutes with a single GPU.
Contribution
It proposes a novel voxel grid-based approach with post-activation interpolation and priors, enabling fast and high-quality scene reconstruction.
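The difference that interpolation order makes can be sketched in a few lines of NumPy. This is an illustrative sketch, not the authors' implementation: it assumes "post-activation" means interpolating the raw grid values first and applying the density activation and opacity conversion afterwards. The softplus activation, the `step` size, and all function names are assumptions chosen for the example.

```python
import numpy as np

def softplus(x):
    # Numerically stable softplus: log(1 + exp(x)).
    return np.logaddexp(0.0, x)

def trilinear(grid, pts):
    """Trilinearly interpolate a scalar (X, Y, Z) grid at continuous voxel coordinates pts (N, 3)."""
    res = np.array(grid.shape)
    pts = np.clip(np.asarray(pts, dtype=np.float64), 0.0, res - 1.0)
    lo = np.minimum(np.floor(pts).astype(int), res - 2)  # lower corner of the enclosing cell
    frac = pts - lo                                      # position inside the cell, in [0, 1]
    out = np.zeros(len(pts))
    for corner in np.ndindex(2, 2, 2):
        c = np.array(corner)
        # Weight of this corner: product of per-axis linear weights.
        w = np.prod(np.where(c == 1, frac, 1.0 - frac), axis=1)
        idx = lo + c
        out += w * grid[idx[:, 0], idx[:, 1], idx[:, 2]]
    return out

def to_alpha(raw_density, step=0.5):
    # Assumed activation: softplus keeps density non-negative,
    # then density is converted to opacity over one ray step.
    return 1.0 - np.exp(-softplus(raw_density) * step)

def post_activation_alpha(raw_grid, pts, step=0.5):
    # Interpolate raw grid values first, then activate.
    return to_alpha(trilinear(raw_grid, pts), step)

def pre_activation_alpha(raw_grid, pts, step=0.5):
    # Activate every voxel first, then interpolate the resulting opacities.
    return trilinear(to_alpha(raw_grid, step), pts)

# A 2x2x2 grid that is empty (very negative) at x=0 and solid (very positive) at x=1.
raw_grid = np.empty((2, 2, 2))
raw_grid[0] = -10.0
raw_grid[1] = 10.0
sample = np.array([[0.25, 0.0, 0.0]])
post = post_activation_alpha(raw_grid, sample)
pre = pre_activation_alpha(raw_grid, sample)
```

In this toy case the two orderings agree at the voxel centers but differ in between: activating after interpolation keeps the opacity near zero at the sample point, while activating before interpolation blends the two opacities linearly. That suggests why a coarse grid can still represent a sharp surface when the activation comes last.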
Findings
Achieves NeRF-comparable quality in less than 15 minutes.
Uses explicit voxel representations for geometry and appearance.
Converges far faster than NeRF and its variants, which need hours to days of training per scene, while reaching comparable quality.
Abstract
We present a super-fast convergence approach to reconstructing the per-scene radiance field from a set of images that capture the scene with known poses. This task, which is often applied to novel view synthesis, is recently revolutionized by Neural Radiance Field (NeRF) for its state-of-the-art quality and flexibility. However, NeRF and its variants require a lengthy training time ranging from hours to days for a single scene. In contrast, our approach achieves NeRF-comparable quality and converges rapidly from scratch in less than 15 minutes with a single GPU. We adopt a representation consisting of a density voxel grid for scene geometry and a feature voxel grid with a shallow network for complex view-dependent appearance. Modeling with explicit and discretized volume representations is not new, but we propose two simple yet non-trivial techniques that contribute to fast convergence…
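The representation the abstract describes (a density voxel grid for geometry, plus a feature voxel grid decoded by a shallow network for view-dependent color) might look roughly like the following NumPy sketch. The grid resolution, feature width, network size, activations, and every name here are hypothetical choices for illustration, not values taken from the paper, and the weights are random rather than trained.

```python
import numpy as np

def trilinear(grid, pts):
    """Interpolate a (X, Y, Z, C) grid at continuous voxel coordinates pts (N, 3)."""
    res = np.array(grid.shape[:3])
    pts = np.clip(np.asarray(pts, dtype=np.float64), 0.0, res - 1.0)
    lo = np.minimum(np.floor(pts).astype(int), res - 2)
    frac = pts - lo
    out = np.zeros((len(pts), grid.shape[3]))
    for corner in np.ndindex(2, 2, 2):
        c = np.array(corner)
        w = np.prod(np.where(c == 1, frac, 1.0 - frac), axis=1)
        idx = lo + c
        out += w[:, None] * grid[idx[:, 0], idx[:, 1], idx[:, 2]]
    return out

class VoxelRadianceField:
    """Hypothetical container: explicit grids plus a tiny two-layer color network."""

    def __init__(self, resolution=16, feat_dim=4, hidden=16, seed=0):
        rng = np.random.default_rng(seed)
        shape = (resolution, resolution, resolution)
        self.density = np.zeros(shape + (1,))                     # scene geometry
        self.features = rng.normal(0.0, 0.1, shape + (feat_dim,)) # appearance features
        # Shallow network: (features + view direction) -> hidden -> RGB.
        self.w1 = rng.normal(0.0, 0.5, (feat_dim + 3, hidden))
        self.b1 = np.zeros(hidden)
        self.w2 = rng.normal(0.0, 0.5, (hidden, 3))
        self.b2 = np.zeros(3)

    def query(self, pts, view_dirs):
        # Density depends only on position; softplus keeps it non-negative.
        raw = trilinear(self.density, pts)[:, 0]
        sigma = np.logaddexp(0.0, raw)
        # Color depends on interpolated features and the viewing direction.
        feats = trilinear(self.features, pts)
        x = np.concatenate([feats, view_dirs], axis=1)
        h = np.maximum(x @ self.w1 + self.b1, 0.0)                # ReLU
        rgb = 1.0 / (1.0 + np.exp(-(h @ self.w2 + self.b2)))      # sigmoid to (0, 1)
        return sigma, rgb

field = VoxelRadianceField()
rng = np.random.default_rng(1)
pts = rng.uniform(0.0, 15.0, (5, 3))
dirs = rng.normal(size=(5, 3))
dirs /= np.linalg.norm(dirs, axis=1, keepdims=True)
sigma, rgb = field.query(pts, dirs)
```

Keeping density in a plain grid means geometry is read by direct lookup, with the network used only for color. In a real system the grids and weights would be optimized against the input images through volume rendering, which this sketch leaves out.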
Taxonomy
Topics: Advanced Vision and Imaging · Computer Graphics and Visualization Techniques · 3D Shape Modeling and Analysis
Methods: SPEED: Separable Pyramidal Pooling EncodEr-Decoder for Real-Time Monocular Depth Estimation on Low-Resource Settings
