VoxNeRF: Bridging Voxel Representation and Neural Radiance Fields for Enhanced Indoor View Synthesis
Sen Wang, Qing Cheng, Stefano Gasperini, Wei Zhang, Shun-Cheng Wu, Niclas Zeller, Daniel Cremers, Nassir Navab

TL;DR
VoxNeRF introduces a voxel-guided sampling method and a depth loss, both built on geometry priors, to improve the quality and efficiency of indoor view synthesis, including in real-time scenarios.
Contribution
It presents a novel voxel-guided sampling technique and depth loss that leverage geometry priors for faster, higher-quality indoor view synthesis.
Findings
Outperforms state-of-the-art methods on ScanNet datasets
Reduces training and rendering time significantly
Establishes new benchmarks for indoor view synthesis
Abstract
The generation of high-fidelity view synthesis is essential for robotic navigation and interaction but remains challenging, particularly in indoor environments and real-time scenarios. Existing techniques often require significant computational resources for both training and rendering, and they frequently result in suboptimal 3D representations due to insufficient geometric structuring. To address these limitations, we introduce VoxNeRF, a novel approach that utilizes easy-to-obtain geometry priors to enhance both the quality and efficiency of neural indoor reconstruction and novel view synthesis. We propose an efficient voxel-guided sampling technique that allocates computational resources selectively to the most relevant segments of rays based on a voxel-encoded geometry prior, significantly reducing training and rendering time. Additionally, we incorporate a robust depth loss to…
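The voxel-guided sampling idea described in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function name `voxel_guided_samples`, the boolean `occupancy` grid, the assumption that the grid starts at the world origin, and the probe-then-resample strategy are all hypothetical choices made for illustration. The sketch probes a ray at evenly spaced depths, keeps only the probes that fall in occupied voxels, and spreads the sample budget over those segments.

```python
import numpy as np

def voxel_guided_samples(origin, direction, occupancy, voxel_size,
                         near, far, n_probe=256, n_samples=32):
    """Return sample depths along one ray, restricted to occupied voxels.

    Hypothetical helper for illustration only; `occupancy` is assumed to be a
    boolean 3D grid whose corner sits at the world origin.
    """
    direction = direction / np.linalg.norm(direction)
    # Split [near, far] into equal bins and probe the ray at each bin center.
    edges = np.linspace(near, far, n_probe + 1)
    centers = 0.5 * (edges[:-1] + edges[1:])
    points = origin + centers[:, None] * direction
    # Convert probe points to integer voxel indices.
    idx = np.floor(points / voxel_size).astype(int)
    inside = np.all((idx >= 0) & (idx < np.array(occupancy.shape)), axis=1)
    occupied = np.zeros(n_probe, dtype=bool)
    occupied[inside] = occupancy[tuple(idx[inside].T)]
    if not occupied.any():
        return np.empty(0)  # the ray never meets the geometry prior: skip it
    # Spread the sample budget uniformly over the occupied bins only.
    bin_starts = edges[:-1][occupied]
    bin_width = edges[1] - edges[0]
    u = (np.arange(n_samples) + 0.5) / n_samples * len(bin_starts)
    k = np.minimum(np.floor(u).astype(int), len(bin_starts) - 1)
    return bin_starts[k] + (u - k) * bin_width
```

For a ray that crosses a single occupied slab, every returned depth falls inside that slab (up to half a probe bin), while a ray that misses the prior gets no samples at all. In this sketch, that is where the saving in computation comes from.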
Taxonomy
Topics: Advanced Vision and Imaging · Computer Graphics and Visualization Techniques · Robotics and Sensor-Based Localization
Methods: SPEED: Separable Pyramidal Pooling EncodEr-Decoder for Real-Time Monocular Depth Estimation on Low-Resource Settings
