Re-ttention: Ultra Sparse Visual Generation via Attention Statistical Reshape

Ruichen Chen; Keith G. Mills; Liyao Jiang; Chao Gao; Di Niu

arXiv:2505.22918·cs.CV·October 30, 2025

TL;DR

Re-ttention introduces a highly sparse attention mechanism for visual generation that leverages temporal redundancy to significantly reduce computational complexity while maintaining high visual quality.

Contribution

The paper presents Re-ttention, a novel sparse attention method that reshapes attention scores based on prior distributions to preserve quality at extreme sparsity levels.

Findings

01 Re-ttention uses as few as 3.1% of tokens during inference.

02 Re-ttention outperforms existing sparse attention methods in visual quality.

03 The approach maintains high visual quality while reducing the computational cost of attention.

Abstract

Diffusion Transformers (DiT) have become the de-facto model for generating high-quality visual content like videos and images. A huge bottleneck is the attention mechanism where complexity scales quadratically with resolution and video length. One logical way to lessen this burden is sparse attention, where only a subset of tokens or patches are included in the calculation. However, existing techniques fail to preserve visual quality at extremely high sparsity levels and might even incur non-negligible compute overheads. To address this concern, we propose Re-ttention, which implements very high sparse attention for visual generation models by leveraging the temporal redundancy of Diffusion Models to overcome the probabilistic normalization shift within the attention mechanism. Specifically, Re-ttention reshapes attention scores based on the prior softmax distribution history in order…
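The normalization shift mentioned in the abstract can be illustrated with a toy example. The sketch below is not the authors' implementation: the token mask, the score values, the helper names (`sparse_softmax`, `kept_mass`), and the idea of rescaling by a mass ratio cached from the previous denoising step are all assumptions chosen only to show why a softmax over a small subset of tokens inflates the surviving weights, and how statistics from an earlier step could in principle offset that inflation when consecutive steps are similar.

```python
import math
import random


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def sparse_softmax(scores, keep):
    """Softmax restricted to the indices in `keep`; every other weight is 0."""
    kept = softmax([scores[i] for i in keep])
    out = [0.0] * len(scores)
    for i, w in zip(keep, kept):
        out[i] = w
    return out


def kept_mass(scores, keep):
    """Share of the full softmax mass that falls on the kept indices."""
    full = softmax(scores)
    return sum(full[i] for i in keep)


random.seed(0)
n_tokens = 64
keep = list(range(4))  # hypothetical mask: one query attends to 4 of 64 keys

# Hypothetical attention scores at two consecutive denoising steps.
# The second step is a small perturbation of the first (temporal redundancy).
scores_prev = [random.gauss(0.0, 1.0) for _ in range(n_tokens)]
scores_curr = [s + random.gauss(0.0, 0.05) for s in scores_prev]

full = softmax(scores_curr)
sparse = sparse_softmax(scores_curr, keep)

# Assumed correction: scale the sparse weights by the kept mass measured
# at the previous step, so they no longer sum to 1 over only 4 tokens.
rho_prev = kept_mass(scores_prev, keep)
rescaled = [w * rho_prev for w in sparse]

# Largest deviation from the dense weights on the kept tokens.
err_sparse = max(abs(sparse[i] - full[i]) for i in keep)
err_rescaled = max(abs(rescaled[i] - full[i]) for i in keep)

print("kept mass at previous step:", rho_prev)
print("max error, plain sparse softmax:", err_sparse)
print("max error, rescaled sparse softmax:", err_rescaled)
```

In this toy setting the plain sparse softmax forces the four kept weights to sum to 1, while in the dense softmax they carry only a small fraction of the total mass. Rescaling by the cached fraction brings them much closer to the dense values, because the fraction changes little between the two steps.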

Peer Reviews

No public reviews on file for this paper yet.

Code & Models

Repositories

cccrrrccc/re-ttention
none · Official

Videos

Re-ttention: Ultra Sparse Visual Generation via Attention Statistical Reshape · SlidesLive

Taxonomy

Methods: Attention Is All You Need · Softmax · Diffusion