HASTE: Training-Free Video Diffusion Acceleration via Head-Wise Adaptive Sparse Attention

Xuzhe Zheng; Yuexiao Ma; Jing Xu; Xiawu Zheng; Rongrong Ji; Fei Chao

arXiv:2605.14513·cs.CV·May 15, 2026

TL;DR

HASTE is a training-free, head-wise adaptive sparse attention method for video diffusion models. It speeds up inference by up to 1.93x at 720P while keeping video quality and similarity metrics competitive.

Contribution

The paper proposes a head-wise adaptive framework with two plug-in components, Temporal Mask Reuse and Error-guided Budgeted Calibration, to improve the speed-quality trade-off of training-free sparse attention.

Findings

01. Achieves up to 1.93x speedup at 720P resolution.

02. Maintains competitive video quality and similarity metrics.

03. Improves efficiency of pretrained video diffusion models.

Abstract

Diffusion-based video generation has advanced substantially in visual fidelity and temporal coherence, but practical deployment remains limited by the quadratic complexity of full attention. Training-free sparse attention is attractive because it accelerates pretrained models without retraining, yet existing online top-$p$ sparse attention still spends non-negligible cost on mask prediction and applies shared thresholds despite strong head-level heterogeneity. We show that these two overlooked factors limit the practical speed-quality trade-off of training-free sparse attention in Video DiTs. To address them, we introduce a head-wise adaptive framework with two plug-in components: Temporal Mask Reuse, which skips unnecessary mask prediction based on query-key drift, and Error-guided Budgeted Calibration, which assigns per-head top-$p$ thresholds by minimizing measured model-output error…
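To make the two ideas in the abstract concrete, here is a minimal sketch in plain Python. It is not the authors' implementation. It assumes that top-$p$ selection keeps the smallest set of key blocks whose softmax mass reaches a per-head threshold p, and that drift is a relative L2 change of a pooled query-key summary vector. The names `top_p_mask`, `relative_drift`, `HeadMaskCache`, `drift_tol`, and `score_fn` are hypothetical, and the calibration step that would choose each head's p is not shown.

```python
import math


def top_p_mask(scores, p):
    """Keep the smallest set of key blocks whose softmax mass reaches p."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]  # numerically stable softmax
    total = sum(exps)
    probs = [e / total for e in exps]
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    mask = [False] * len(probs)
    mass = 0.0
    for i in order:
        mask[i] = True
        mass += probs[i]
        if mass >= p:
            break
    return mask


def relative_drift(current, cached):
    """Relative L2 change between two equal-length summary vectors."""
    num = math.sqrt(sum((a - b) ** 2 for a, b in zip(current, cached)))
    den = math.sqrt(sum(b * b for b in cached)) or 1.0  # avoid divide-by-zero
    return num / den


class HeadMaskCache:
    """Per-head cache: reuse the last mask while the summary drifts little.

    Assumption: `p` and `drift_tol` are set per head by some outside
    procedure (the paper's calibration is not reproduced here).
    """

    def __init__(self, p, drift_tol):
        self.p = p
        self.drift_tol = drift_tol
        self.cached_summary = None
        self.cached_mask = None

    def get_mask(self, summary, score_fn):
        # Reuse the cached mask when drift from the last prediction is small.
        if self.cached_mask is not None:
            if relative_drift(summary, self.cached_summary) <= self.drift_tol:
                return self.cached_mask
        # Otherwise pay for mask prediction: score blocks, then select top-p.
        self.cached_summary = list(summary)
        self.cached_mask = top_p_mask(score_fn(), self.p)
        return self.cached_mask
```

In this sketch, `score_fn` is only called when the drift check fails, which is where the saving on mask prediction would come from. Giving each head its own `HeadMaskCache` with a different `p` mirrors the head-wise thresholds described above.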
