Improving Motion in Image-to-Video Models via Adaptive Low-Pass Guidance
June Suk Choi, Kyungmin Lee, Sihyun Yu, Yisol Choi, Jinwoo Shin, Kimin Lee

TL;DR
This paper introduces adaptive low-pass guidance (ALG), a training-free method that enhances motion dynamics in image-to-video generation by modulating frequency content during sampling, without sacrificing image quality.
Contribution
The work traces static outputs in image-to-video (I2V) models to premature exposure to high-frequency details in the input image, and proposes ALG, a simple, training-free change to the sampling procedure that improves temporal dynamics in generated videos.
Findings
ALG significantly increases video motion dynamics by 33% on average.
ALG preserves or improves image fidelity and text alignment.
The method is training-free and easy to integrate into existing I2V models.
Abstract
Recent text-to-video (T2V) models have demonstrated strong capabilities in producing high-quality, dynamic videos. To improve the visual controllability, recent works have considered fine-tuning pre-trained T2V models to support image-to-video (I2V) generation. However, such adaptation frequently suppresses motion dynamics of generated outputs, resulting in more static videos compared to their T2V counterparts. In this work, we analyze this phenomenon and identify that it stems from the premature exposure to high-frequency details in the input image, which biases the sampling process toward a shortcut trajectory that overfits to the static appearance of the reference image. To address this, we propose adaptive low-pass guidance (ALG), a simple training-free fix to the I2V model sampling procedure to generate more dynamic videos without compromising per-frame image quality. Specifically,…
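The abstract is cut off before it describes how ALG is actually implemented, so the following is only a rough illustration of the general idea of low-pass filtering a conditioning image more strongly early in sampling and not at all later. Everything specific here is an assumption made for illustration and is not taken from the paper: the Gaussian filter, the linear decay schedule, the function names (`gaussian_low_pass`, `filter_strength`, `conditioning_for_step`), and the parameters `sigma_max` and `cutoff_fraction`.

```python
import numpy as np


def gaussian_low_pass(image: np.ndarray, sigma: float) -> np.ndarray:
    """Blur a 2D grayscale image with a Gaussian of std `sigma` (pixels).

    The filter is applied in the frequency domain, so it wraps around
    at the image borders (circular convolution). sigma <= 0 is a no-op.
    """
    if sigma <= 0:
        return image.copy()
    h, w = image.shape
    fy = np.fft.fftfreq(h)[:, None]  # vertical frequency, cycles per pixel
    fx = np.fft.fftfreq(w)[None, :]  # horizontal frequency, cycles per pixel
    # Fourier transform of a spatial Gaussian with standard deviation sigma.
    # It equals 1 at zero frequency, so the image mean is preserved.
    transfer = np.exp(-2.0 * (np.pi * sigma) ** 2 * (fx ** 2 + fy ** 2))
    return np.real(np.fft.ifft2(np.fft.fft2(image) * transfer))


def filter_strength(step: int, num_steps: int,
                    sigma_max: float = 4.0,
                    cutoff_fraction: float = 0.3) -> float:
    """Hypothetical schedule (an assumption, not the paper's):
    blur strength decays linearly from sigma_max at step 0 to zero at
    cutoff_fraction * num_steps, and stays at zero afterwards.
    """
    cutoff_step = cutoff_fraction * num_steps
    if step >= cutoff_step:
        return 0.0
    return sigma_max * (1.0 - step / cutoff_step)


def conditioning_for_step(image: np.ndarray, step: int,
                          num_steps: int) -> np.ndarray:
    """Return the (possibly blurred) conditioning image for one sampling step."""
    return gaussian_low_pass(image, filter_strength(step, num_steps))
```

In a sampler loop, one would call `conditioning_for_step(reference_image, step, num_steps)` at each step and pass the result to the I2V model in place of the raw reference image. Early steps then see only coarse structure, while later steps see the unfiltered image, which is one plausible way to avoid early lock-in to static appearance while keeping per-frame detail.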
Taxonomy
Topics: Generative Adversarial Networks and Image Synthesis · Multimodal Machine Learning Applications · Advanced Vision and Imaging
