VoxGRAF: Fast 3D-Aware Image Synthesis with Sparse Voxel Grids
Katja Schwarz, Axel Sauer, Michael Niemeyer, Yiyi Liao, and Andreas Geiger

TL;DR
VoxGRAF introduces a fast, 3D-aware image synthesis method using sparse voxel grids and 3D convolutions, enabling high-fidelity, viewpoint-consistent rendering with a single forward pass.
Contribution
The paper proposes replacing coordinate-based MLPs with sparse voxel grids and 3D convolutions for efficient, high-resolution 3D-aware image synthesis, and it disentangles the foreground from the background.
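As a rough illustration of what keeping foreground and background separate can look like at the image level, the sketch below blends a rendered foreground over an independently produced background using a per-pixel opacity map. The summary does not state how the two are combined, so the alpha-compositing rule and the name `composite_over_background` are assumptions made for this example, not the authors' implementation.

```python
import numpy as np

def composite_over_background(fg_rgb, fg_alpha, bg_rgb):
    """Blend a foreground render over a background image (hypothetical helper).

    fg_rgb:   (H, W, 3) foreground colors
    fg_alpha: (H, W)    foreground opacity in [0, 1]
    bg_rgb:   (H, W, 3) background colors
    """
    alpha = fg_alpha[..., None]  # add a channel axis so it broadcasts over RGB
    return alpha * fg_rgb + (1.0 - alpha) * bg_rgb

# Tiny 2x2 example: white foreground over a black background.
fg = np.ones((2, 2, 3))
bg = np.zeros((2, 2, 3))
alpha = np.array([[1.0, 0.0],
                  [0.5, 0.0]])
image = composite_over_background(fg, alpha, bg)
```

Because the background never passes through the 3D representation in this sketch, changing the camera pose would only affect `fg_rgb` and `fg_alpha`.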
Findings
Single forward pass generates full 3D scenes efficiently.
Achieves high visual fidelity and 3D consistency.
Supports arbitrary viewpoint rendering.
Abstract
State-of-the-art 3D-aware generative models rely on coordinate-based MLPs to parameterize 3D radiance fields. While demonstrating impressive results, querying an MLP for every sample along each ray leads to slow rendering. Therefore, existing approaches often render low-resolution feature maps and process them with an upsampling network to obtain the final image. Albeit efficient, neural rendering often entangles viewpoint and content such that changing the camera pose results in unwanted changes of geometry or appearance. Motivated by recent results in voxel-based novel view synthesis, we investigate the utility of sparse voxel grid representations for fast and 3D-consistent generative modeling in this paper. Our results demonstrate that monolithic MLPs can indeed be replaced by 3D convolutions when combining sparse voxel grids with progressive growing, free space pruning and…
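Two ideas from the abstract can be sketched in a few lines of NumPy: pruning free space from a density grid so that only occupied voxels are stored, and compositing the samples along one ray without any per-sample MLP query. This is a minimal sketch under stated assumptions, not the paper's code: the function names, the pruning threshold, and the fixed step size `delta` are all hypothetical, and the compositing rule is the standard emission-absorption model used in radiance-field rendering.

```python
import numpy as np

def prune_free_space(density, threshold=0.01):
    """Keep only voxels whose density exceeds `threshold` (assumed value).

    Returns the integer coordinates (N, 3) and densities (N,) of the kept voxels.
    """
    mask = density > threshold
    coords = np.argwhere(mask)  # voxel indices in row-major order
    values = density[mask]      # same row-major order as `coords`
    return coords, values

def composite_along_ray(sigmas, colors, delta):
    """Alpha-composite samples ordered front to back along a single ray.

    sigmas: (S,)   densities at the samples
    colors: (S, 3) colors at the samples
    delta:  float  distance between consecutive samples (assumed constant)
    """
    alphas = 1.0 - np.exp(-sigmas * delta)
    # Transmittance reaching each sample: product of (1 - alpha) over earlier samples.
    trans = np.concatenate([[1.0], np.cumprod(1.0 - alphas)[:-1]])
    weights = trans * alphas
    rgb = (weights[:, None] * colors).sum(axis=0)
    return rgb, weights.sum()

# A 4x4x4 grid with a single occupied voxel.
grid = np.zeros((4, 4, 4))
grid[1, 2, 3] = 5.0
coords, values = prune_free_space(grid)

# A ray whose first sample is empty and whose second sample is fully opaque green.
rgb, opacity = composite_along_ray(
    sigmas=np.array([0.0, 1e9]),
    colors=np.array([[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]),
    delta=1.0,
)
```

Storing `coords` and `values` instead of the dense grid is what makes the representation sparse: memory and compute scale with the number of occupied voxels rather than with the full grid resolution.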
Taxonomy
Topics: Advanced Vision and Imaging · Computer Graphics and Visualization Techniques · Generative Adversarial Networks and Image Synthesis
Methods: Pruning
