Enhancing Long Video Generation Consistency without Tuning
Xingyao Li, Fengzhuo Zhang, Jiachun Pan, Yunlong Hou, Vincent Y. F. Tan, Zhuoran Yang

TL;DR
This paper introduces TiARA, a frequency-based attention reweighting algorithm, and PromptBlend, a prompt alignment method, to significantly improve the consistency and coherence of long video generation without additional tuning.
Contribution
It presents the first frequency-based attention editing technique for video diffusion models and a novel prompt interpolation pipeline for better prompt alignment.
Findings
TiARA improves video frame consistency and smoothness.
PromptBlend enhances prompt interpolation quality.
Experimental results show substantial improvements over baselines.
Abstract
Despite considerable progress on long video generation, there is still significant room to improve the consistency of the generated videos, particularly their smoothness and the transitions between scenes. We address these issues to enhance the consistency and coherence of videos generated with either single or multiple prompts. We propose the Time-frequency based temporal Attention Reweighting Algorithm (TiARA), which judiciously edits the attention score matrix based on the Discrete Short-Time Fourier Transform. This method is supported by a frequency-based analysis, ensuring that the edited attention score matrix achieves improved consistency across frames. It is the first frequency-based method of its kind for video diffusion models. For videos generated by multiple prompts, we further uncover key factors such as the alignment of the…
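The abstract describes editing a temporal attention score matrix using a Discrete Short-Time Fourier Transform, but this page does not give the algorithm itself. The following is only a minimal illustrative sketch of what a frequency-guided edit of attention scores could look like, and it is not the authors' TiARA. All names (`stft_1d`, `high_freq_ratio`, `reweight_scores`), the window length, the cutoff bin, and the moving-average smoothing rule are assumptions made for illustration.

```python
import numpy as np

def stft_1d(signal, win_len=8, hop=4):
    """Discrete short-time Fourier transform with a Hann window.
    Hypothetical helper; assumes len(signal) >= win_len."""
    window = np.hanning(win_len)
    starts = range(0, len(signal) - win_len + 1, hop)
    frames = np.stack([signal[s:s + win_len] * window for s in starts])
    return np.fft.rfft(frames, axis=1)  # (num_windows, win_len // 2 + 1)

def high_freq_ratio(signal, win_len=8, hop=4, cutoff=2):
    """Share of spectral energy in bins >= cutoff, averaged over windows."""
    power = np.abs(stft_1d(signal, win_len, hop)) ** 2
    total = power.sum(axis=1) + 1e-12  # avoid division by zero
    return float((power[:, cutoff:].sum(axis=1) / total).mean())

def reweight_scores(scores, strength=1.0):
    """Assumed edit rule (not from the paper): blend each query row of the
    pre-softmax scores with a 3-tap moving average over key frames, using
    more smoothing when the row has more high-frequency energy."""
    scores = np.asarray(scores, dtype=float)
    kernel = np.ones(3) / 3.0
    out = np.empty_like(scores)
    for i, row in enumerate(scores):
        alpha = min(1.0, strength * high_freq_ratio(row))
        padded = np.pad(row, 1, mode="edge")
        smoothed = np.convolve(padded, kernel, mode="valid")
        out[i] = (1.0 - alpha) * row + alpha * smoothed
    return out

def softmax(x, axis=-1):
    """Numerically stable softmax."""
    shifted = x - x.max(axis=axis, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=axis, keepdims=True)

# Toy usage: 16 frames, random temporal attention scores (frames x frames).
rng = np.random.default_rng(0)
toy_scores = rng.normal(size=(16, 16))
toy_attention = softmax(reweight_scores(toy_scores))
```

In this sketch, a row whose scores vary slowly across key frames is left almost untouched, while a row that oscillates rapidly is pulled toward its local average before the softmax. The actual reweighting rule and its analysis are given in the paper, not here.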
Taxonomy
Topics: Multimedia Communication and Technology · Video Coding and Compression Technologies · Video Analysis and Summarization
Methods: Softmax · Attention Is All You Need · Diffusion
