MegaScenes: Scene-Level View Synthesis at Scale
Joseph Tung, Gene Chou, Ruojin Cai, Guandao Yang, Kai Zhang, Gordon Wetzstein, Bharath Hariharan, Noah Snavely

TL;DR
This paper introduces MegaScenes, a large-scale scene-level dataset from Internet photos, and proposes methods to improve view synthesis consistency, enabling better in-the-wild scene generation.
Contribution
The paper creates MegaScenes, a diverse dataset from Internet photos, and enhances scene-level view synthesis methods for more consistent in-the-wild scene generation.
Findings
MegaScenes contains over 100K SfM reconstructions from Internet photos.
The proposed methods significantly improve generation consistency over state-of-the-art methods.
Their effectiveness is validated on diverse real-world scenes.
Abstract
Scene-level novel view synthesis (NVS) is fundamental to many vision and graphics applications. Recently, pose-conditioned diffusion models have led to significant progress by extracting 3D information from 2D foundation models, but these methods are limited by the lack of scene-level training data. Common dataset choices either consist of isolated objects (Objaverse), or of object-centric scenes with limited pose distributions (DTU, CO3D). In this paper, we create a large-scale scene-level dataset from Internet photo collections, called MegaScenes, which contains over 100K structure from motion (SfM) reconstructions from around the world. Internet photos represent a scalable data source but come with challenges such as lighting and transient objects. We address these issues to further create a subset suitable for the task of NVS. Additionally, we analyze failure cases of…
Taxonomy
Topics: Computer Graphics and Visualization Techniques · Generative Adversarial Networks and Image Synthesis
Methods: Diffusion
