Mega-NeRF: Scalable Construction of Large-Scale NeRFs for Virtual Fly-Throughs

Haithem Turki; Deva Ramanan; Mahadev Satyanarayanan

arXiv:2112.10703 · cs.CV · March 30, 2022 · 21 cites

TL;DR

Mega-NeRF enables scalable training and fast, high-quality rendering of large-scale 3D scenes captured primarily by drones. It does so with network parameters specialized to different regions of the scene and with data-parallel training, which improves both training and rendering efficiency.

Contribution

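The paper introduces a sparse neural radiance field architecture whose parameters are specialized to different regions of the scene, together with a simple geometric clustering algorithm for data parallelism. Together these address the model-capacity and training-cost challenges of large-scale scene modeling.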
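As a rough illustration of what a geometric clustering step for data parallelism could look like, the sketch below assigns 2D ground-plane positions to the nearest of a set of grid centroids, so each region's data can be handed to a separate worker. The grid layout, the 2D simplification, and the names `grid_centroids`, `assign_to_regions`, and `partition` are assumptions made for this example; the source only states that a simple geometric clustering algorithm is used.

```python
from math import dist


def grid_centroids(x_range, y_range, nx, ny):
    """Place nx * ny centroids at the cell centers of a regular 2D grid.

    Hypothetical helper: the regular grid is an assumption of this sketch.
    """
    (x0, x1), (y0, y1) = x_range, y_range
    return [
        (x0 + (i + 0.5) * (x1 - x0) / nx, y0 + (j + 0.5) * (y1 - y0) / ny)
        for i in range(nx)
        for j in range(ny)
    ]


def assign_to_regions(points, centroids):
    """Return, for each point, the index of its nearest centroid."""
    return [
        min(range(len(centroids)), key=lambda k: dist(p, centroids[k]))
        for p in points
    ]


def partition(points, centroids):
    """Group points into one shard per centroid (one shard per worker)."""
    shards = {k: [] for k in range(len(centroids))}
    for p, k in zip(points, assign_to_regions(points, centroids)):
        shards[k].append(p)
    return shards


if __name__ == "__main__":
    centroids = grid_centroids((0.0, 4.0), (0.0, 2.0), nx=2, ny=1)
    samples = [(0.5, 0.5), (3.5, 1.5), (1.9, 1.0)]
    for region, members in partition(samples, centroids).items():
        print(region, members)
```

Each shard can then be used to train its own region-specific submodel independently, which is the general idea behind data-parallel training over spatial regions.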

Findings

01. Training speed improved by 3x

02. PSNR increased by 12%

03. Rendering is 40x faster with minimal quality loss
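For readers unfamiliar with the metric, PSNR (peak signal-to-noise ratio) is conventionally defined as 10 · log10(MAX² / MSE), where MSE is the mean squared error between a rendered image and its reference. The sketch below computes it over flat lists of pixel intensities; the function name `psnr` and the assumption that intensities lie in [0, 1] are choices made for this example and are not taken from the paper.

```python
import math


def psnr(reference, rendered, max_value=1.0):
    """Peak signal-to-noise ratio in dB between two equal-length pixel lists.

    Assumes intensities are scaled to [0, max_value] (an assumption of
    this sketch).
    """
    if not reference or len(reference) != len(rendered):
        raise ValueError("inputs must be non-empty and of equal length")
    mse = sum((a - b) ** 2 for a, b in zip(reference, rendered)) / len(reference)
    if mse == 0:
        return math.inf  # identical images: error-free reconstruction
    return 10.0 * math.log10(max_value ** 2 / mse)


if __name__ == "__main__":
    print(psnr([0.0, 1.0], [0.1, 0.9]))
```

Because PSNR is logarithmic, a fixed gain in dB corresponds to a multiplicative reduction in mean squared error.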

Abstract

We use neural radiance fields (NeRFs) to build interactive 3D environments from large-scale visual captures spanning buildings or even multiple city blocks collected primarily from drones. In contrast to single-object scenes (on which NeRFs are traditionally evaluated), our scale poses multiple challenges including (1) the need to model thousands of images with varying lighting conditions, each of which captures only a small subset of the scene, (2) prohibitively large model capacities that make it infeasible to train on a single GPU, and (3) significant challenges for fast rendering that would enable interactive fly-throughs. To address these challenges, we begin by analyzing visibility statistics for large-scale scenes, motivating a sparse network structure where parameters are specialized to different regions of the scene. We introduce a simple geometric clustering algorithm for…

Taxonomy

Topics: Advanced Vision and Imaging · Computer Graphics and Visualization Techniques · Remote Sensing and LiDAR Applications

Methods: SPEED: Separable Pyramidal Pooling EncodEr-Decoder for Real-Time Monocular Depth Estimation on Low-Resource Settings