CoNo: Consistency Noise Injection for Tuning-free Long Video Diffusion

Xingrui Wang; Xin Li; Zhibo Chen

arXiv:2406.05082 · cs.CV · June 10, 2024

TL;DR

CoNo introduces a novel noise injection method with a look-back mechanism and long-term regularization to improve scene consistency in long video diffusion without retraining.

Contribution

It proposes the CoNo method, enhancing long video generation by modeling fine-grained scene transitions and maintaining content consistency without additional training.

Findings

1. Improves scene consistency in long video generation.
2. Effective under single- and multi-text prompts.
3. Reduces abrupt scene transitions.

Abstract

Tuning-free long video diffusion has been proposed to generate extended-duration videos with enriched content by reusing the knowledge of a pre-trained short video diffusion model without retraining. However, most works overlook fine-grained modeling of long-term video consistency, resulting in limited scene consistency (i.e., unreasonable object or background transitions), especially with multiple text inputs. To mitigate this, we propose Consistency Noise Injection, dubbed CoNo, which introduces a "look-back" mechanism to enhance fine-grained scene transitions between different video clips, and designs a long-term consistency regularization to eliminate content shifts when extending video content through noise prediction. In particular, the "look-back" mechanism breaks the noise scheduling process into three essential parts, where one internal noise prediction part is…

Taxonomy

Topics: Integrated Circuits and Semiconductor Failure Analysis · Electrostatic Discharge in Electronics · 3D IC and TSV technologies

Methods: Contrastive Language-Image Pre-training · Diffusion