Fine-gained Zero-shot Video Sampling

Dengsheng Chen; Jie Hu; Xiaoming Wei; Enhua Wu

arXiv:2407.21475 · cs.CV · August 1, 2024

TL;DR

The paper introduces $\mathcal{ZS}^2$, a zero-shot video sampling method that generates high-quality, fine-grained videos from image diffusion models without training, outperforming some supervised approaches.

Contribution

It presents a novel zero-shot video sampling algorithm that leverages dependency noise and temporal momentum attention to produce detailed videos from image models.
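This summary does not spell out how the dependency noise is constructed, so the following is only an illustrative sketch of the general idea of noise that is correlated across frames, not the paper's formulation. It assumes a simple first-order autoregressive mix; the function names `correlated_frame_noise` and `correlation` and the parameter `rho` are hypothetical and chosen for this example.

```python
import math
import random


def correlated_frame_noise(num_frames, size, rho=0.9, seed=0):
    """Return `num_frames` noise vectors, each of length `size`.

    Assumption (not from the paper): each frame's noise is a blend of the
    previous frame's noise and fresh Gaussian noise. The blend is scaled so
    every frame keeps unit variance, which is what an image diffusion
    sampler expects of its starting noise.
    """
    if not 0.0 <= rho <= 1.0:
        raise ValueError("rho must lie in [0, 1]")
    rng = random.Random(seed)
    fresh_scale = math.sqrt(1.0 - rho * rho)
    frames = [[rng.gauss(0.0, 1.0) for _ in range(size)]]
    for _ in range(num_frames - 1):
        prev = frames[-1]
        frames.append(
            [rho * p + fresh_scale * rng.gauss(0.0, 1.0) for p in prev]
        )
    return frames


def correlation(a, b):
    """Pearson correlation between two equal-length sequences."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)


if __name__ == "__main__":
    frames = correlated_frame_noise(num_frames=4, size=5000, rho=0.9)
    print(round(correlation(frames[0], frames[1]), 2))
```

With `rho=0` every frame gets independent noise, which is the setting where frames sampled from an image model tend to be unrelated; with `rho=1` all frames share the same noise. Intermediate values trade temporal consistency against motion between frames.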

Findings

01. Achieves state-of-the-art zero-shot video generation performance.

02. Outperforms some recent supervised methods.

03. Enables high-quality, fine-grained video synthesis from images.

Abstract

Incorporating a temporal dimension into pretrained image diffusion models for video generation is a prevalent approach. However, this method is computationally demanding and necessitates large-scale video datasets. More critically, the heterogeneity between image and video datasets often results in catastrophic forgetting of the image expertise. Recent attempts to directly extract video snippets from image diffusion models have somewhat mitigated these problems. Nevertheless, these methods can only generate brief video clips with simple movements and fail to capture fine-grained motion or non-grid deformation. In this paper, we propose a novel Zero-Shot video Sampling algorithm, denoted as $\mathcal{ZS}^2$, capable of directly sampling high-quality video clips from existing image synthesis methods, such as Stable Diffusion, without any training or optimization. Specifically,…

Taxonomy

Topics: Photoacoustic and Ultrasonic Imaging

Methods: Softmax · Attention Is All You Need · Diffusion