FreeLong: Training-Free Long Video Generation with SpectralBlend Temporal Attention
Yu Lu, Yuanzhi Liang, Linchao Zhu, Yi Yang

TL;DR
FreeLong is a training-free method that extends short video diffusion models to generate long videos with improved quality and consistency by balancing frequency components during denoising.
Contribution
It introduces a novel frequency blending technique that balances global and local features, enabling high-quality long video generation without additional training.
Findings
Significant quality improvements in long video generation.
Enhanced global consistency and local detail preservation.
Supports coherent multi-prompt scene transitions.
Abstract
Video diffusion models have made substantial progress in various video generation applications. However, training models for long video generation tasks require significant computational and data resources, posing a challenge to developing long video diffusion models. This paper investigates a straightforward and training-free approach to extend an existing short video diffusion model (e.g. pre-trained on 16-frame videos) for consistent long video generation (e.g. 128 frames). Our preliminary observation has found that directly applying the short video diffusion model to generate long videos can lead to severe video quality degradation. Further investigation reveals that this degradation is primarily due to the distortion of high-frequency components in long videos, characterized by a decrease in spatial high-frequency components and an increase in temporal high-frequency components.…
Peer Reviews
No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.
Videos
No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.
Taxonomy
TopicsAdvanced Vision and Imaging · Generative Adversarial Networks and Image Synthesis · Human Pose and Action Recognition
MethodsDiffusion · Balanced Selection · Focus
