IV-Mixed Sampler: Leveraging Image Diffusion Models for Enhanced Video Synthesis
Shitong Shao, Zikai Zhou, Lichen Bai, Haoyi Xiong, Zeke Xie

TL;DR
The paper introduces IV-Mixed Sampler, a training-free algorithm that uses image diffusion models to improve the frame quality of synthesized video while preserving temporal coherence, and reports state-of-the-art results on four video benchmarks.
Contribution
It presents a novel inference scaling method that combines image and video diffusion models without additional training to enhance video quality.
Findings
Achieves state-of-the-art performance on four video benchmarks.
Significantly reduces Fréchet Video Distance (FVD; lower is better) compared to previous methods.
Demonstrates effective enhancement of video frame quality and coherence.
Abstract
The multi-step sampling mechanism, a key feature of visual diffusion models, has significant potential to replicate the success of OpenAI's Strawberry in enhancing performance by increasing the inference computational cost. Numerous prior studies have demonstrated that correctly scaling up computation in the sampling process can lead to improved generation quality, enhanced image editing, and compositional generalization. While there have been rapid advancements in developing inference-heavy algorithms for improved image generation, relatively little work has explored inference scaling laws in video diffusion models (VDMs). Furthermore, existing research shows only minimal performance gains that are perceptible to the naked eye. To address this, we design a novel training-free algorithm IV-Mixed Sampler that leverages the strengths of image diffusion models (IDMs) to…
Taxonomy
Topics: Advanced Vision and Imaging · Generative Adversarial Networks and Image Synthesis · Image and Signal Denoising Methods
Methods: Diffusion
