Learning 3D Photography Videos via Self-supervised Diffusion on Single Images

Xiaodong Wang; Chenfei Wu; Shengming Yin; Minheng Ni; Jianfeng Wang; Linjie Li; Zhengyuan Yang; Fan Yang; Lijuan Wang; Zicheng Liu; Yuejian Fang; Nan Duan

arXiv:2302.10781·cs.CV·February 22, 2023

Open Access

TL;DR

This paper introduces a self-supervised diffusion-based inpainting method for 3D photography videos from single images, improving rendering quality without relying on out-of-domain training data.

Contribution

It proposes a novel self-supervised diffusion model with a Masked Enhanced Block for improved inpainting in 3D photography, enabling out-animation and better real-world results.

Findings

01. Achieves results competitive with state-of-the-art methods.

02. Constructs training pairs automatically, without data annotation.

03. Demonstrates improved 3D video rendering quality.

Abstract

3D photography renders a static image into a video with appealing 3D visual effects. Existing approaches typically first conduct monocular depth estimation, then render the input frame to subsequent frames with various viewpoints, and finally use an inpainting model to fill the missing or occluded regions. The inpainting model plays a crucial role in rendering quality, but it is normally trained on out-of-domain data. To reduce the gap between training and inference, we propose a novel self-supervised diffusion model as the inpainting module. Given a single input image, we automatically construct a training pair of the masked occluded image and the ground-truth image with random cycle-rendering. The constructed training samples are closely aligned with the testing instances, without the need for data annotation. To make full use of the masked images, we design a Masked Enhanced Block (MEB), which…
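The cycle-rendering idea in the abstract can be illustrated with a small sketch. This is not the authors' implementation: it assumes a precomputed disparity map, a purely horizontal camera shift, and nearest-pixel forward warping, and the function names (`forward_warp`, `cycle_render_pair`, `random_cycle_pair`) are hypothetical. The image is warped to a random novel view and then warped back; pixels that were occluded along the way never return, and those holes become the inpainting mask, while the untouched input image serves as the ground truth.

```python
import numpy as np


def forward_warp(image, disparity, shift, valid=None):
    """Shift each valid pixel horizontally by round(shift * disparity).

    Pixels are painted far-to-near, so nearer content (larger disparity)
    overwrites farther content that lands on the same target pixel.
    Returns the warped image, warped disparity, and a validity mask.
    """
    h, w = disparity.shape
    if valid is None:
        valid = np.ones((h, w), dtype=bool)
    ys, xs = np.mgrid[0:h, 0:w]
    offset = np.rint(shift * disparity).astype(int)
    new_xs = xs + offset
    keep = valid & (new_xs >= 0) & (new_xs < w)
    # Ascending disparity: with repeated target indices NumPy keeps the
    # last assigned value, which is the nearest pixel here.
    order = np.argsort(disparity[keep], kind="stable")
    y, x_src, x_dst = ys[keep][order], xs[keep][order], new_xs[keep][order]
    out = np.zeros_like(image)
    out_disp = np.zeros_like(disparity)
    out_valid = np.zeros((h, w), dtype=bool)
    out[y, x_dst] = image[y, x_src]
    out_disp[y, x_dst] = disparity[y, x_src]
    out_valid[y, x_dst] = True
    return out, out_disp, out_valid


def cycle_render_pair(image, disparity, shift):
    """Build (masked image, mask, ground truth) by warping out and back."""
    novel, novel_disp, novel_valid = forward_warp(image, disparity, shift)
    _, _, back_valid = forward_warp(novel, novel_disp, -shift, valid=novel_valid)
    mask = ~back_valid  # True where content was lost during the cycle
    masked = image * (~mask)[..., None]
    return masked, mask, image


def random_cycle_pair(image, disparity, rng, max_shift=8.0):
    """Same as cycle_render_pair, with a randomly sampled camera shift."""
    shift = rng.uniform(-max_shift, max_shift)
    return cycle_render_pair(image, disparity, shift)
```

Calling `random_cycle_pair(image, disparity, np.random.default_rng(0))` on an `(H, W, 3)` float image and an `(H, W)` disparity map yields one training sample; the mask shapes follow the scene's depth discontinuities, which is what ties the training masks to the holes seen at inference time.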

Taxonomy

Topics: Advanced Vision and Imaging · Advanced Image Processing Techniques · Generative Adversarial Networks and Image Synthesis

Methods: Diffusion · Inpainting