Direct Voxel Grid Optimization: Super-fast Convergence for Radiance Fields Reconstruction

Cheng Sun; Min Sun; Hwann-Tzong Chen

arXiv:2111.11215 · cs.CV · June 6, 2022 · 37 cites

TL;DR

This paper introduces a rapid convergence method for scene radiance field reconstruction using voxel grids, achieving NeRF-like quality in under 15 minutes with a single GPU.

Contribution

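It proposes a voxel-grid approach built on two techniques: post-activation interpolation of voxel density, which produces sharp surfaces at lower grid resolution, and priors that keep direct density optimization from settling on suboptimal geometry.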
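
To make "post-activation interpolation" concrete, here is a minimal one-dimensional sketch in plain Python. It is an illustration under stated assumptions, not the authors' implementation: the paper works with trilinear interpolation on a 3D grid, while this sketch uses linear interpolation between two neighbouring grid values. The function names, the unit step size, and the example raw values of -20 and 20 are all hypothetical, and any density bias term is omitted.

```python
import math


def softplus(x):
    # Numerically stable softplus: log(1 + exp(x)).
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def alpha_from_raw(raw, step=1.0):
    # Opacity from a raw density value: alpha = 1 - exp(-softplus(raw) * step).
    # `step` stands in for the ray-marching step size (assumed to be 1 here).
    return 1.0 - math.exp(-softplus(raw) * step)


def lerp(a, b, t):
    # Linear interpolation; a 1D stand-in for trilinear interpolation.
    return a + (b - a) * t


def post_activation(raw_a, raw_b, t, step=1.0):
    # Interpolate the raw grid values first, then apply the activations.
    return alpha_from_raw(lerp(raw_a, raw_b, t), step)


def pre_activation(raw_a, raw_b, t, step=1.0):
    # Apply the activations at the grid points first, then interpolate.
    return lerp(alpha_from_raw(raw_a, step), alpha_from_raw(raw_b, step), t)


# One grid cell with an "empty" voxel on the left and a "solid" one on the right.
raw_empty, raw_solid = -20.0, 20.0
samples = [i / 10 for i in range(11)]
post = [post_activation(raw_empty, raw_solid, t) for t in samples]
pre = [pre_activation(raw_empty, raw_solid, t) for t in samples]

for t, a_post, a_pre in zip(samples, post, pre):
    print(f"t={t:.1f}  post={a_post:.3f}  pre={a_pre:.3f}")
```

With these example values, the pre-activation opacity rises as a straight ramp across the cell, whereas the post-activation opacity stays near zero and then jumps to near one around the middle of the cell. That sub-cell transition is what allows a sharp surface to be represented without raising the grid resolution.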

Findings

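1. Achieves NeRF-comparable quality in less than 15 minutes on a single GPU, training from scratch.

2. Represents the scene explicitly: a density voxel grid for geometry, and a feature voxel grid with a shallow network for view-dependent appearance.

3. Converges far faster than NeRF and its variants, which need hours to days per scene, while reaching comparable quality.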

Abstract

We present a super-fast convergence approach to reconstructing the per-scene radiance field from a set of images that capture the scene with known poses. This task, which is often applied to novel view synthesis, is recently revolutionized by Neural Radiance Field (NeRF) for its state-of-the-art quality and flexibility. However, NeRF and its variants require a lengthy training time ranging from hours to days for a single scene. In contrast, our approach achieves NeRF-comparable quality and converges rapidly from scratch in less than 15 minutes with a single GPU. We adopt a representation consisting of a density voxel grid for scene geometry and a feature voxel grid with a shallow network for complex view-dependent appearance. Modeling with explicit and discretized volume representations is not new, but we propose two simple yet non-trivial techniques that contribute to fast convergence…

Taxonomy

Topics: Advanced Vision and Imaging · Computer Graphics and Visualization Techniques · 3D Shape Modeling and Analysis

Methods: SPEED: Separable Pyramidal Pooling EncodEr-Decoder for Real-Time Monocular Depth Estimation on Low-Resource Settings