TL;DR
PixelSynth combines 3D reasoning with autoregressive modeling to synthesize immersive scenes with large view changes from a single image, keeping the result 3D-consistent and outperforming existing approaches.
Contribution
The paper presents an approach that extrapolates large view changes from a single image by outpainting in a 3D-consistent manner, which enables scene synthesis rather than only small viewpoint shifts.
Findings
Considerable improvement in single-image, large-angle view synthesis over a variety of methods and possible variants.
Increased 3D consistency compared to alternative accumulation methods.
Results hold across both simulated and real datasets.
Abstract
Recent advancements in differentiable rendering and 3D reasoning have driven exciting results in novel view synthesis from a single image. Despite realistic results, methods are limited to relatively small view change. In order to synthesize immersive scenes, models must also be able to extrapolate. We present an approach that fuses 3D reasoning with autoregressive modeling to outpaint large view changes in a 3D-consistent manner, enabling scene synthesis. We demonstrate considerable improvement in single image large-angle view synthesis results compared to a variety of methods and possible variants across simulated and real datasets. In addition, we show increased 3D consistency compared to alternative accumulation methods. Project website: https://crockwell.github.io/pixelsynth/
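The abstract notes that large view changes require extrapolation. A minimal sketch of why, assuming a pinhole camera and a known per-pixel depth map: when source pixels are reprojected into a rotated camera, part of the target image receives no pixels at all, and that uncovered region is what an outpainting model would have to fill. This is an illustration only, not the authors' implementation; the function names `reproject_coverage` and `count_holes`, the focal length, and the rotation sign convention are all assumptions.

```python
import math


def reproject_coverage(depth, focal, yaw_deg):
    """Forward-warp every source pixel into a camera rotated about the
    vertical axis (hypothetical helper; rotation sign is an assumed convention).

    depth: list of rows of positive depth values (H x W).
    Returns an H x W grid of booleans, True where the target image
    received at least one reprojected source pixel.
    """
    h, w = len(depth), len(depth[0])
    cx, cy = (w - 1) / 2.0, (h - 1) / 2.0  # principal point at image center
    c = math.cos(math.radians(yaw_deg))
    s = math.sin(math.radians(yaw_deg))
    covered = [[False] * w for _ in range(h)]
    for v in range(h):
        for u in range(w):
            z = depth[v][u]
            # Unproject pixel (u, v) to a 3D point in the source camera frame.
            x = (u - cx) / focal * z
            y = (v - cy) / focal * z
            # Express the point in the rotated camera frame.
            xr = c * x - s * z
            zr = s * x + c * z
            if zr <= 0:
                continue  # point is behind the target camera
            # Project back to pixel coordinates in the target view.
            ut = round(focal * xr / zr + cx)
            vt = round(focal * y / zr + cy)
            if 0 <= ut < w and 0 <= vt < h:
                covered[vt][ut] = True
    return covered


def count_holes(covered):
    """Number of target pixels that no source pixel landed on."""
    return sum(1 for row in covered for hit in row if not hit)


# A flat wall at constant depth, seen by a small 8 x 4 camera.
flat_depth = [[2.0] * 8 for _ in range(4)]
holes_no_rotation = count_holes(reproject_coverage(flat_depth, 8.0, 0.0))
holes_large_rotation = count_holes(reproject_coverage(flat_depth, 8.0, 30.0))
```

With no rotation every target pixel is covered, while the 30 degree rotation leaves a band of the target image empty. A small view change leaves only thin gaps that interpolation can hide; a large one leaves regions that must be generated.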
