JVID: Joint Video-Image Diffusion for Visual-Quality and Temporal-Consistency in Video Generation

Hadrien Reynaud; Matthew Baugh; Mischa Dombrowski; Sarah Cechnicka; Qingjie Meng; Bernhard Kainz

arXiv:2409.14149·cs.CV·September 30, 2024

Open Access

TL;DR

JVID combines a latent image diffusion model and a latent video diffusion model in the reverse diffusion process to generate videos that are both high in visual quality and temporally consistent.

Contribution

The paper presents a joint diffusion framework in which an image-trained model (LIDM) improves per-frame quality and a video-trained model (LVDM) enforces temporal consistency.

Findings

01. Enhanced video realism and coherence demonstrated.

02. Quantitative improvements over existing methods.

03. Qualitative analysis confirms better temporal stability.

Abstract

We introduce the Joint Video-Image Diffusion model (JVID), a novel approach to generating high-quality and temporally coherent videos. We achieve this by integrating two diffusion models: a Latent Image Diffusion Model (LIDM) trained on images and a Latent Video Diffusion Model (LVDM) trained on video data. Our method combines these models in the reverse diffusion process, where the LIDM enhances image quality and the LVDM ensures temporal consistency. This unique combination allows us to effectively handle the complex spatio-temporal dynamics in video generation. Our results demonstrate quantitative and qualitative improvements in producing realistic and coherent videos.
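
The idea of combining two models inside one reverse diffusion loop can be illustrated with a toy sketch. The abstract does not say how the LIDM and LVDM steps are interleaved, so everything below is an assumption made for illustration: the per-step random choice controlled by a hypothetical `p_video` parameter, the function names, and the two stand-in "denoisers" (which are simple arithmetic, not trained networks). The only point it shows is structural: an image model acts on each frame independently, a video model acts across frames, and a single sampler can switch between them step by step.

```python
import random
from typing import List, Tuple

Frame = List[float]   # one flattened latent frame (toy representation)
Video = List[Frame]   # a clip is a list of frames


def image_step(video: Video, t: int) -> Video:
    """Toy stand-in for one LIDM step: every frame is processed on its own."""
    return [[0.9 * x for x in frame] for frame in video]


def video_step(video: Video, t: int) -> Video:
    """Toy stand-in for one LVDM step: each frame is mixed with its neighbours in time."""
    n = len(video)
    out = []
    for i, frame in enumerate(video):
        prev_f = video[max(i - 1, 0)]
        next_f = video[min(i + 1, n - 1)]
        out.append([0.9 * (p + c + nx) / 3.0
                    for p, c, nx in zip(prev_f, frame, next_f)])
    return out


def joint_reverse_process(video: Video, num_steps: int, p_video: float,
                          rng: random.Random) -> Tuple[Video, List[str]]:
    """Run a reverse loop, picking the video or image step at each timestep.

    The random per-step choice is an assumption of this sketch, not a
    detail taken from the abstract.
    """
    choices = []
    for t in range(num_steps, 0, -1):
        use_video = rng.random() < p_video
        step = video_step if use_video else image_step
        video = step(video, t)
        choices.append("video" if use_video else "image")
    return video, choices


# Start from Gaussian noise: 8 frames, 4 latent values per frame.
rng = random.Random(0)
noise = [[rng.gauss(0.0, 1.0) for _ in range(4)] for _ in range(8)]
sample, schedule = joint_reverse_process(noise, num_steps=10, p_video=0.5, rng=rng)
```

In this sketch, setting `p_video` to 1.0 reduces the loop to a pure video sampler and 0.0 to a pure per-frame image sampler; intermediate values mix the two, which is the trade-off between per-frame quality and temporal consistency that the abstract describes.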

Taxonomy

Topics: Advanced Vision and Imaging · Video Coding and Compression Technologies · Advanced Image Processing Techniques

Methods: Diffusion