VORTA: Efficient Video Diffusion via Routing Sparse Attention
Wenhao Sun, Rong-Cheng Tu, Yifu Ding, Zhao Jin, Jingyi Liao, Shunyu Liu, Dacheng Tao

TL;DR
VORTA pairs a sparse attention mechanism with an adaptive routing strategy to accelerate video diffusion transformers: a 1.76x speedup on its own without quality loss on VBench, and up to 14.41x with minimal quality loss when combined with other acceleration methods.
Contribution
An acceleration framework with two components: a sparse attention mechanism that efficiently captures long-range dependencies, and a routing strategy that adaptively replaces full 3D attention with specialized sparse attention variants.
Findings
Achieves 1.76x speedup without quality loss on VBench.
Can be combined with other methods for up to 14.41x speedup.
Demonstrates practical efficiency improvements for real-world applications.
Abstract
Video diffusion transformers have achieved remarkable progress in high-quality video generation, but remain computationally expensive due to the quadratic complexity of attention over high-dimensional video sequences. Recent acceleration methods enhance the efficiency by exploiting the local sparsity of attention scores; yet they often struggle with accelerating the long-range computation. To address this problem, we propose VORTA, an acceleration framework with two novel components: 1) a sparse attention mechanism that efficiently captures long-range dependencies, and 2) a routing strategy that adaptively replaces full 3D attention with specialized sparse attention variants. VORTA achieves a 1.76x end-to-end speedup without loss of quality on VBench. Furthermore, it can seamlessly integrate with various other acceleration methods, such as model caching and step distillation,…
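To make the two ideas in the abstract concrete, here is a minimal NumPy sketch of routing between full attention and sparse variants. It is an illustration under stated assumptions, not the authors' implementation: the names `local_window_mask`, `pooled_attention`, and `routed_attention`, the 1D token layout, the average-pooling stand-in for the long-range branch, and the argmax over `gate_logits` are all hypothetical choices made for this example.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax; masked (-inf) entries become exactly 0.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def attention(q, k, v, mask=None):
    """Scaled dot-product attention. mask is boolean, True = keep."""
    scores = q @ k.T / np.sqrt(q.shape[-1])
    if mask is not None:
        scores = np.where(mask, scores, -np.inf)
    return softmax(scores) @ v

def local_window_mask(n, window):
    """True where |i - j| <= window (a 1D stand-in for a 3D local window).

    Dense masking is used here for clarity only; it does not save compute.
    """
    idx = np.arange(n)
    return np.abs(idx[:, None] - idx[None, :]) <= window

def pooled_attention(q, k, v, stride):
    """Hypothetical long-range branch: attend to average-pooled keys/values.

    The score matrix shrinks from n x n to n x (n // stride).
    """
    n, d = k.shape
    m = n // stride
    k_c = k[: m * stride].reshape(m, stride, d).mean(axis=1)
    v_c = v[: m * stride].reshape(m, stride, d).mean(axis=1)
    return attention(q, k_c, v_c)

def routed_attention(q, k, v, gate_logits, window=2, stride=4):
    """Pick one branch by argmax over assumed per-layer gate logits."""
    branches = [
        lambda: attention(q, k, v),                                     # full
        lambda: attention(q, k, v, local_window_mask(len(q), window)),  # local
        lambda: pooled_attention(q, k, v, stride),                      # coarse
    ]
    choice = int(np.argmax(gate_logits))
    return choice, branches[choice]()

rng = np.random.default_rng(0)
q, k, v = rng.standard_normal((3, 16, 8))  # 16 tokens, head dimension 8
choice, out = routed_attention(q, k, v, gate_logits=np.array([0.1, 2.0, 0.3]))
```

In this toy setup the gate selects the local-window branch. A real system would need a block-sparse kernel for that branch to see any speedup, and how the routing decision is learned or calibrated is left open here.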
Taxonomy
Topics: Video Coding and Compression Technologies · Advanced Data Compression Techniques · Advanced Image Processing Techniques
Methods: Softmax · Attention Is All You Need · Diffusion
