Mega-NeRF: Scalable Construction of Large-Scale NeRFs for Virtual Fly-Throughs
Haithem Turki, Deva Ramanan, Mahadev Satyanarayanan

TL;DR
Mega-NeRF scales neural radiance fields to large drone-captured 3D scenes by specializing network parameters to different regions of the scene and training them with data parallelism, which speeds up both training and rendering while preserving image quality.
Contribution
The paper introduces a sparse, region-specialized neural radiance field architecture and a geometric clustering algorithm for data parallelism, addressing large-scale scene modeling challenges.
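To make the idea of region-specialized data parallelism concrete, here is a minimal sketch of one way training rays could be partitioned among spatial regions. It is an illustration under stated assumptions, not the authors' implementation: the function name `assign_rays_to_cells`, the use of region centroids, the nearest-centroid rule, and the uniform sampling along each ray are all hypothetical choices made for this example.

```python
import numpy as np

def assign_rays_to_cells(origins, directions, centroids, near, far, n_samples=64):
    """Return a boolean mask of shape [n_rays, n_cells].

    mask[r, c] is True when at least one sample point along ray r is
    closer to centroid c than to any other centroid. (Assumed rule for
    this sketch; the paper's actual partitioning may differ.)
    """
    # Sample depths uniformly between the near and far bounds.
    t = np.linspace(near, far, n_samples)                                  # [S]
    # 3D sample points along every ray.
    points = origins[:, None, :] + t[None, :, None] * directions[:, None, :]  # [R, S, 3]
    # Distance from every sample point to every region centroid.
    dists = np.linalg.norm(
        points[:, :, None, :] - centroids[None, None, :, :], axis=-1
    )                                                                      # [R, S, C]
    nearest = dists.argmin(axis=-1)                                        # [R, S]

    mask = np.zeros((origins.shape[0], centroids.shape[0]), dtype=bool)
    rows = np.repeat(np.arange(origins.shape[0]), n_samples)
    mask[rows, nearest.ravel()] = True
    return mask
```

Under this assumed scheme, each region's network would be trained only on the rays whose mask entry for that region is True, so the regions can be trained independently on separate GPUs. A ray that crosses several regions appears in each of their training sets.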
Findings
Training speed improved by 3x over prior work
PSNR improved by 12% over prior work
Rendering is 40x faster than conventional NeRF rendering while remaining within 0.8 dB in PSNR
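For readers unfamiliar with the quality metric used in these findings, PSNR (peak signal-to-noise ratio) is computed from the mean squared error between a rendered image and its reference. The sketch below uses the standard definition; the function name `psnr` and the assumption that pixel values lie in [0, 1] are choices made for this example, not details taken from the paper.

```python
import numpy as np

def psnr(rendered, reference, max_val=1.0):
    """Peak signal-to-noise ratio in dB between two images.

    Assumes both images share a shape and that pixel values
    lie in [0, max_val].
    """
    rendered = np.asarray(rendered, dtype=np.float64)
    reference = np.asarray(reference, dtype=np.float64)
    mse = np.mean((rendered - reference) ** 2)
    if mse == 0:
        return float("inf")  # identical images
    return 10.0 * np.log10(max_val ** 2 / mse)
```

Because the scale is logarithmic, higher values mean a closer match to the reference, and a uniform per-pixel error of 0.1 on a [0, 1] image corresponds to 20 dB.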
Abstract
We use neural radiance fields (NeRFs) to build interactive 3D environments from large-scale visual captures spanning buildings or even multiple city blocks collected primarily from drones. In contrast to single-object scenes (on which NeRFs are traditionally evaluated), our scale poses multiple challenges including (1) the need to model thousands of images with varying lighting conditions, each of which captures only a small subset of the scene, (2) prohibitively large model capacities that make it infeasible to train on a single GPU, and (3) significant challenges for fast rendering that would enable interactive fly-throughs. To address these challenges, we begin by analyzing visibility statistics for large-scale scenes, motivating a sparse network structure where parameters are specialized to different regions of the scene. We introduce a simple geometric clustering algorithm for…
Taxonomy
Topics: Advanced Vision and Imaging · Computer Graphics and Visualization Techniques · Remote Sensing and LiDAR Applications
Methods: SPEED: Separable Pyramidal Pooling EncodEr-Decoder for Real-Time Monocular Depth Estimation on Low-Resource Settings
