DiffDreamer: Towards Consistent Unsupervised Single-view Scene Extrapolation with Conditional Diffusion Models

Shengqu Cai; Eric Ryan Chan; Songyou Peng; Mohamad Shahbazi; Anton Obukhov; Luc Van Gool; Gordon Wetzstein

arXiv:2211.12131 · cs.CV · March 21, 2023 · 1 citation

Open Access

TL;DR

DiffDreamer introduces an unsupervised diffusion model framework capable of generating consistent long-range scene extrapolations from single images, outperforming prior GAN-based methods in maintaining scene coherence.

Contribution

The paper presents DiffDreamer, a novel unsupervised diffusion-based approach for long-range scene extrapolation that leverages multiple frames for conditioning, improving consistency and quality.

Findings

01. Outperforms GAN-based methods in scene consistency

02. Trains without supervision, solely on internet-collected images of nature scenes

03. Capable of synthesizing long camera trajectories

Abstract

Scene extrapolation -- the idea of generating novel views by flying into a given image -- is a promising, yet challenging task. For each predicted frame, a joint inpainting and 3D refinement problem has to be solved, which is ill-posed and includes a high level of ambiguity. Moreover, training data for long-range scenes is difficult to obtain and usually lacks sufficient views to infer accurate camera poses. We introduce DiffDreamer, an unsupervised framework capable of synthesizing novel views depicting a long camera trajectory while training solely on internet-collected images of nature scenes. Utilizing the stochastic nature of the guided denoising steps, we train the diffusion models to refine projected RGBD images but condition the denoising steps on multiple past and future frames for inference. We demonstrate that image-conditioned diffusion models can effectively perform…
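
To make the phrase "projected RGBD images" concrete, here is a minimal sketch of forward-warping an RGBD frame into a nearby camera view with NumPy. It is an illustration under stated assumptions, not the paper's implementation: the function name `warp_rgbd`, the pinhole intrinsics `K`, the 4x4 source-to-target transform `T`, and the nearest-pixel splatting are all hypothetical choices. The returned mask marks pixels that received a value; the remaining holes are the regions a refinement model would need to inpaint.

```python
import numpy as np

def warp_rgbd(rgb, depth, K, T):
    """Forward-warp an RGBD image into a target camera (illustrative sketch).

    rgb:   (H, W, 3) colors, depth: (H, W) positive depths in the source view.
    K:     (3, 3) pinhole intrinsics (assumed shared by both views).
    T:     (4, 4) rigid transform from source to target camera coordinates.
    Returns (warped_rgb, warped_depth, mask); mask is False where no source
    pixel landed, i.e. where content would have to be inpainted.
    """
    H, W = depth.shape
    v, u = np.mgrid[0:H, 0:W]
    # Homogeneous pixel coordinates, shape (3, N).
    pix = np.stack([u.ravel(), v.ravel(), np.ones(H * W)]).astype(np.float64)

    # Back-project to 3D in the source camera, then move to the target camera.
    pts = (np.linalg.inv(K) @ pix) * depth.reshape(1, -1)
    pts = T[:3, :3] @ pts + T[:3, 3:4]

    # Project into the target image; guard against points behind the camera.
    z = pts[2]
    in_front = z > 1e-6
    z_safe = np.where(in_front, z, 1.0)
    proj = K @ pts
    u2 = np.rint(proj[0] / z_safe).astype(int)
    v2 = np.rint(proj[1] / z_safe).astype(int)
    valid = in_front & (u2 >= 0) & (u2 < W) & (v2 >= 0) & (v2 < H)
    idx = np.flatnonzero(valid)

    # Z-buffer: keep the nearest depth per target pixel.
    out_depth = np.full((H, W), np.inf)
    np.minimum.at(out_depth, (v2[idx], u2[idx]), z[idx])

    # Copy colors only from the source pixels that won the depth test.
    win = idx[z[idx] == out_depth[v2[idx], u2[idx]]]
    out_rgb = np.zeros_like(rgb)
    out_rgb[v2[win], u2[win]] = rgb.reshape(-1, 3)[win]

    mask = np.isfinite(out_depth)
    out_depth[~mask] = 0.0
    return out_rgb, out_depth, mask
```

With an identity `T` the warp reproduces the input. Moving the camera forward (for example, a negative z translation in `T`) spreads source pixels apart and leaves unfilled pixels in `mask`, which is the kind of gap the abstract's joint inpainting and refinement step addresses.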

Taxonomy

Topics: Advanced Vision and Imaging · Generative Adversarial Networks and Image Synthesis · Advanced Image Processing Techniques

Methods: Inpainting · Diffusion