RainFusion: Adaptive Video Generation Acceleration via Multi-Dimensional Visual Redundancy

Aiyue Chen; Bin Dong; Jingru Li; Jing Lin; Kun Tian; Yiwu Yao; Gongyi Wang

arXiv:2505.21036 · cs.CV · June 10, 2025

TL;DR

RainFusion introduces a training-free sparse attention method that exploits visual data sparsity to significantly accelerate 3D attention in video generation models while preserving quality.

Contribution

It proposes a novel, plug-and-play sparse attention technique with an adaptive recognition module that accelerates video generation without additional training.

Findings

01 Over 2x speedup in attention computation

02 Maintains video quality, with minimal impact on quality scores

03 Applicable to multiple state-of-the-art models
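
As a rough way to read the first finding, Amdahl's law relates an attention-only speedup to the end-to-end speedup. The sketch below is an illustration, not a result from the paper: it assumes the "over 80%" attention share quoted in the abstract translates directly into run time, and it uses 0.8 and 2.0 as round lower-bound inputs. The function name `end_to_end_speedup` is hypothetical.

```python
def end_to_end_speedup(attention_fraction: float, attention_speedup: float) -> float:
    """Amdahl's law: overall speedup when only the attention share gets faster."""
    if not 0.0 <= attention_fraction <= 1.0:
        raise ValueError("attention_fraction must be in [0, 1]")
    if attention_speedup <= 0.0:
        raise ValueError("attention_speedup must be positive")
    # Non-attention work is unchanged; attention work shrinks by the speedup factor.
    remaining = (1.0 - attention_fraction) + attention_fraction / attention_speedup
    return 1.0 / remaining


# Assumed inputs: attention is 80% of run time and is accelerated 2x.
estimate = end_to_end_speedup(0.8, 2.0)
print(f"{estimate:.2f}x")  # prints "1.67x"
```

Under these assumptions, a 2x attention speedup yields about a 1.67x end-to-end speedup; the remaining 20% of non-attention work caps the overall gain at 5x no matter how sparse the attention becomes.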

Abstract

Video generation using diffusion models is highly computationally intensive, with 3D attention in Diffusion Transformer (DiT) models accounting for over 80% of the total computational resources. In this work, we introduce RainFusion, a novel training-free sparse attention method that exploits the inherent sparsity of visual data to accelerate attention computation while preserving video quality. Specifically, we identify three unique sparse patterns in video generation attention calculations: Spatial Pattern, Temporal Pattern, and Textural Pattern. The sparse pattern for each attention head is determined online during inference, with negligible overhead (~0.2%), by our proposed ARM (Adaptive Recognition Module). RainFusion is a plug-and-play method that can be seamlessly integrated into state-of-the-art 3D-attention video generation…
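
To make the idea of a per-head sparse pattern concrete, here is a minimal NumPy sketch of attention restricted by a boolean mask. It is not the paper's implementation: the abstract names the patterns but does not define them, so the "spatial" mask (attend only within the same frame) and the "temporal" mask (attend only to the same position across frames) are illustrative assumptions, the Textural Pattern is left out, and ARM's online pattern selection is not modeled. The names `pattern_mask` and `masked_attention` are hypothetical.

```python
import numpy as np


def pattern_mask(pattern: str, frames: int, tokens_per_frame: int) -> np.ndarray:
    """Boolean [n, n] mask; True means the query row may attend to the key column."""
    n = frames * tokens_per_frame
    frame_id = np.arange(n) // tokens_per_frame  # frame index of each token
    pos_id = np.arange(n) % tokens_per_frame     # position of each token in its frame
    if pattern == "spatial":   # assumption: attend within the same frame only
        return frame_id[:, None] == frame_id[None, :]
    if pattern == "temporal":  # assumption: attend to the same position across frames
        return pos_id[:, None] == pos_id[None, :]
    if pattern == "dense":     # full 3D attention, no sparsity
        return np.ones((n, n), dtype=bool)
    raise ValueError(f"unknown pattern: {pattern}")


def masked_attention(q: np.ndarray, k: np.ndarray, v: np.ndarray,
                     mask: np.ndarray) -> np.ndarray:
    """Single-head softmax attention where masked-out pairs get zero weight."""
    scores = q @ k.T / np.sqrt(q.shape[-1])
    scores = np.where(mask, scores, -np.inf)
    scores -= scores.max(axis=-1, keepdims=True)  # numerically stable softmax
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ v


# Toy sizes: 2 frames of 3 tokens each, head dimension 4.
rng = np.random.default_rng(0)
q, k, v = (rng.standard_normal((6, 4)) for _ in range(3))
out = masked_attention(q, k, v, pattern_mask("spatial", frames=2, tokens_per_frame=3))
print(out.shape)  # prints "(6, 4)"
```

This dense-mask version only shows which query-key pairs each pattern keeps; it still computes every score, so it saves no time. An actual speedup requires a kernel that skips the masked-out blocks entirely.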

Taxonomy

Topics: Image Enhancement Techniques · Video Coding and Compression Technologies · Advanced Vision and Imaging