TI2V-Zero: Zero-Shot Image Conditioning for Text-to-Video Diffusion Models
Haomiao Ni, Bernhard Egger, Suhas Lohit, Anoop Cherian, Ye Wang, Toshiaki Koike-Akino, Sharon X. Huang, Tim K. Marks

TL;DR
TI2V-Zero is a zero-shot, tuning-free method that enables a pretrained text-to-video diffusion model to generate videos conditioned on an input image and text, without additional training or external modules.
Contribution
The paper introduces a zero-shot approach that combines a 'repeat-and-slide' strategy with inversion techniques, so that a pretrained text-to-video model can be conditioned on an input image without fine-tuning.
Findings
Outperforms recent open-domain TI2V models on various datasets.
Supports video infilling, prediction, and long video generation.
Operates without optimization or external modules.
Abstract
Text-conditioned image-to-video generation (TI2V) aims to synthesize a realistic video starting from a given image (e.g., a woman's photo) and a text description (e.g., "a woman is drinking water."). Existing TI2V frameworks often require costly training on video-text datasets and specific model designs for text and image conditioning. In this paper, we propose TI2V-Zero, a zero-shot, tuning-free method that empowers a pretrained text-to-video (T2V) diffusion model to be conditioned on a provided image, enabling TI2V generation without any optimization, fine-tuning, or introducing external modules. Our approach leverages a pretrained T2V diffusion foundation model as the generative prior. To guide video generation with the additional image input, we propose a "repeat-and-slide" strategy that modulates the reverse denoising process, allowing the frozen diffusion model to synthesize a…
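The abstract is cut off before it spells out the mechanics of "repeat-and-slide", so the following is only a toy sketch of one plausible reading: a fixed-size queue is first filled with copies of the input image ("repeat"), each new frame is denoised while the known frames are re-noised and re-imposed at every reverse step, and the queue then drops its oldest frame and takes in the new one ("slide"). Everything here is an assumption made for illustration: frames are single floats rather than image tensors, and `toy_denoise_step`, `add_noise`, `repeat_and_slide`, `window`, and `num_steps` are hypothetical names, not the authors' code or any library's API. The inversion techniques mentioned under Contribution are left out entirely.

```python
from collections import deque
import random


def toy_denoise_step(window_frames, t, prompt):
    """Hypothetical stand-in for one reverse step of a frozen T2V model.

    A real model would predict noise from the frames, the step t, and the
    text prompt. This toy ignores t and prompt and simply pulls every
    frame halfway toward the window mean.
    """
    mean = sum(window_frames) / len(window_frames)
    return [0.5 * f + 0.5 * mean for f in window_frames]


def add_noise(frame, t, num_steps, rng):
    """Noise a known frame to roughly the level of step t (toy linear schedule)."""
    sigma = t / num_steps
    return frame + sigma * rng.gauss(0.0, 1.0)


def repeat_and_slide(image, prompt, num_new_frames, window=4, num_steps=10, seed=0):
    """Generate frames one at a time, conditioned on a sliding queue (assumed reading)."""
    rng = random.Random(seed)
    # "Repeat": fill the conditioning queue with copies of the input image.
    queue = deque([image] * (window - 1), maxlen=window - 1)
    video = [image]
    for _ in range(num_new_frames):
        new_frame = rng.gauss(0.0, 1.0)  # the unknown slot starts as pure noise
        for t in range(num_steps, 0, -1):
            # Re-impose the known frames at the current noise level, so the
            # frozen model only has to fill in the last slot of the window.
            known = [add_noise(f, t, num_steps, rng) for f in queue]
            new_frame = toy_denoise_step(known + [new_frame], t, prompt)[-1]
        video.append(new_frame)
        # "Slide": maxlen makes the deque drop its oldest frame automatically.
        queue.append(new_frame)
    return video


if __name__ == "__main__":
    frames = repeat_and_slide(1.0, "a woman is drinking water", num_new_frames=5)
    print(len(frames))
```

In this sketch the model itself is never updated; only its inputs are overwritten at each step, which is the sense in which such a scheme would be tuning-free. Swapping the float frames for latent tensors and the toy step for a real sampler would be needed before this resembles an actual pipeline.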
Taxonomy
Topics: Advanced Image Processing Techniques · Radiomics and Machine Learning in Medical Imaging · Mycobacterium research and diagnosis
Methods: Diffusion
