Attention Sparsity is Input-Stable: Training-Free Sparse Attention for Video Generation via Offline Sparsity Profiling and Online QK Co-Clustering

Jiayi Luo; Jiayu Chen; Jiankun Wang; Cong Wang; Hanxin Zhu; Qingyun Sun; Chen Gao; Zhibo Chen; Jianxin Li

arXiv:2603.18636·cs.CV·May 11, 2026

TL;DR

SVOO is a training-free sparse attention method for video generation. It profiles each layer's attention sparsity offline and co-clusters queries and keys online, aiming for faster inference with little loss in quality.

Contribution

SVOO combines offline layer-wise sparsity profiling with online bidirectional co-clustering to address two limitations of prior training-free methods: layer heterogeneity in attention pruning and query-key coupling in block partitioning.

Findings

01. Achieves up to 1.93x speedup in video generation.

02. Maintains a PSNR of up to 29 dB on Wan2.1.

03. Outperforms state-of-the-art sparse attention methods.

Abstract

Diffusion Transformers (DiTs) achieve strong video generation quality but suffer from high inference cost due to dense 3D attention, motivating sparse attention techniques for improving efficiency. However, existing training-free sparse attention methods for video generation still face two unresolved limitations: ignoring layer heterogeneity in attention pruning and ignoring query-key coupling in block partitioning, which hinder a better quality-speedup trade-off. In this work, we uncover a critical insight: attention sparsity is an intrinsic layer-wise property, with only minor variation across different inputs. Motivated by this observation, we propose SVOO, a training-free sparse attention framework for fast video generation via offline layer-wise sparsity profiling and online bidirectional co-clustering. Specifically, SVOO adopts a two-stage paradigm: (i) offline layer-wise…
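To make the idea of a per-layer sparsity budget concrete, here is a minimal pure-Python sketch. It is not the SVOO implementation: the profiling metric (the smallest fraction of keys that covers 95% of each query's attention mass), the function names `keep_ratio`, `profile_layer`, and `toy_attention`, and the synthetic attention maps are all assumptions made for illustration. It shows only how a budget could be estimated once from a few calibration inputs and then reused per layer.

```python
import math
import random


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def keep_ratio(attn_row, mass=0.95):
    """Smallest fraction of keys whose weights sum to at least `mass`."""
    running, kept = 0.0, 0
    for weight in sorted(attn_row, reverse=True):
        running += weight
        kept += 1
        if running >= mass:
            break
    return kept / len(attn_row)


def profile_layer(attn_maps, mass=0.95):
    """Average keep ratio over every query row of every calibration input."""
    ratios = [keep_ratio(row, mass) for attn in attn_maps for row in attn]
    return sum(ratios) / len(ratios)


def toy_attention(n_tokens, sharpness, rng):
    """Hypothetical stand-in for one layer's attention map on one input."""
    return [
        softmax([sharpness * rng.gauss(0, 1) for _ in range(n_tokens)])
        for _ in range(n_tokens)
    ]


rng = random.Random(0)
# Two toy "layers": one with peaked attention, one with diffuse attention.
layers = {"peaked": 4.0, "diffuse": 0.5}
# Offline step (assumed): profile each layer on 4 synthetic calibration inputs.
budgets = {
    name: profile_layer([toy_attention(64, sharpness, rng) for _ in range(4)])
    for name, sharpness in layers.items()
}
```

In this toy setup the peaked layer gets a much smaller budget than the diffuse one, which is the kind of layer heterogeneity that a single global pruning ratio would miss.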

Code & Models

Repositories

Mutual-Luo/SVOO (GitHub)
