Video Diffusion Models are Strong Video Inpainter

Minhyeok Lee; Suhwan Cho; Chajin Shin; Jungho Lee; Sunghun Yang; Sangyoun Lee

arXiv:2408.11402·cs.CV·December 17, 2024

TL;DR

This paper introduces FFF-VDI, a video inpainting model that builds on pre-trained image-to-video diffusion models to produce more natural and temporally consistent videos than propagation-based methods, which are limited by inaccurate optical flow.

Contribution

The paper presents the first integration of image-to-video diffusion models into video inpainting, enhancing quality and temporal consistency over existing propagation-based methods.

Findings

01 Outperforms optical flow-based methods in quality and consistency

02 Robustly handles diverse inpainting scenarios

03 Produces more natural and temporally coherent videos

Abstract

Propagation-based video inpainting using optical flow at the pixel or feature level has recently garnered significant attention. However, it has limitations such as the inaccuracy of optical flow prediction and the propagation of noise over time. These issues result in non-uniform noise and time consistency problems throughout the video, which are particularly pronounced when the removed area is large and involves substantial movement. To address these issues, we propose a novel First Frame Filling Video Diffusion Inpainting model (FFF-VDI). We design FFF-VDI inspired by the capabilities of pre-trained image-to-video diffusion models that can transform the first frame image into a highly natural video. To apply this to the video inpainting task, we propagate the noise latent information of future frames to fill the masked areas of the first frame's noise latent code. Next, we fine-tune…
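The first-frame filling idea in the abstract can be illustrated with a toy sketch. Everything here is an assumption made for illustration: the function name `fill_first_frame`, the single-channel list-of-lists latent layout, and above all the copy rule, which takes the value at the same spatial position from the earliest later frame where that position is unmasked. It uses no motion compensation and is not the authors' propagation module, whose details are not given in this excerpt.

```python
from typing import List

Grid = List[List[float]]      # one H x W latent channel (hypothetical layout)
MaskGrid = List[List[bool]]   # True where the position is masked (a hole)


def fill_first_frame(latents: List[Grid], masks: List[MaskGrid]) -> Grid:
    """Fill holes in frame 0 from later frames.

    Assumption: a hole at (y, x) is filled with the latent value at the same
    (y, x) in the earliest later frame where that position is unmasked.
    No motion compensation is applied.
    """
    first = [row[:] for row in latents[0]]  # copy so the input is not mutated
    for y, row in enumerate(masks[0]):
        for x, is_hole in enumerate(row):
            if not is_hole:
                continue
            for t in range(1, len(latents)):
                if not masks[t][y][x]:
                    first[y][x] = latents[t][y][x]
                    break
            # Positions masked in every frame keep their original value.
    return first


# Three 2 x 2 frames whose latent values equal the frame index.
latents = [
    [[0.0, 0.0], [0.0, 0.0]],
    [[1.0, 1.0], [1.0, 1.0]],
    [[2.0, 2.0], [2.0, 2.0]],
]
masks = [
    [[True, True], [False, True]],
    [[False, True], [False, True]],
    [[False, False], [False, True]],
]
filled = fill_first_frame(latents, masks)
print(filled)  # [[1.0, 2.0], [0.0, 0.0]]
```

In this toy input the bottom-right position is masked in every frame, so it stays unfilled. Regions like that are the ones a generative model would have to synthesize rather than copy.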

Taxonomy

Topics: Generative Adversarial Networks and Image Synthesis · Advanced Image Processing Techniques

Methods: Diffusion · Inpainting