SparVAR: Exploring Sparsity in Visual AutoRegressive Modeling for Training-Free Acceleration

Zekun Li; Ning Wang; Tongxin Bai; Changwang Mei; Peisong Wang; Shuang Qiu; Jian Cheng

arXiv:2602.04361·cs.CV·March 31, 2026

TL;DR

SparVAR is a training-free framework that accelerates visual autoregressive modeling by exploiting sparsity in attention mechanisms, significantly reducing inference time while preserving image quality.

Contribution

SparVAR introduces a sparse attention prediction method and an efficient kernel implementation for large-scale VAR models, achieving over 5x speed-up without sacrificing details.

Findings

01. Reduces generation time of 8B models to 1 second for high-res images.

02. Achieves 1.57x speed-up over FlashAttention while maintaining image quality.

03. Up to 2.28x acceleration when combined with scale-skipping strategies.

Abstract

Visual AutoRegressive (VAR) modeling has garnered significant attention for its innovative next-scale prediction paradigm. However, mainstream VAR paradigms attend to all tokens across historical scales at each autoregressive step. As the next scale resolution grows, the computational complexity of attention increases quartically with resolution, causing substantial latency. Prior accelerations often skip high-resolution scales, which speeds up inference but discards high-frequency details and harms image quality. To address these problems, we present SparVAR, a training-free acceleration framework that exploits three properties of VAR attention: (i) strong attention sinks, (ii) cross-scale activation similarity, and (iii) pronounced locality. Specifically, we dynamically predict the sparse attention pattern of later high-resolution scales from a…
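
To make the quartic-growth claim concrete, the following is a minimal Python sketch that counts query-key pairs per scale under dense attention. It assumes each scale with side length s holds s*s tokens and attends to every token of all earlier scales plus its own. The function name `attention_pairs` and the power-of-two scale schedule are hypothetical, chosen only for illustration, and are not taken from the paper.

```python
def attention_pairs(side_lengths):
    """Count query-key pairs per scale when each scale attends to all history."""
    pairs = []
    history = 0  # tokens accumulated over all scales so far, current one included
    for side in side_lengths:
        tokens = side * side   # a scale with side length s holds s*s tokens
        history += tokens      # queries also attend within their own scale
        pairs.append(tokens * history)
    return pairs


# Hypothetical scale schedule (an assumption, not the paper's configuration).
sides = [1, 2, 4, 8, 16, 32, 64]
pairs = attention_pairs(sides)

for side, count in zip(sides, pairs):
    print(f"side={side:3d}  query-key pairs={count}")

# Doubling the side length multiplies the pair count by roughly 2**4 = 16.
print(f"growth from side 32 to 64: {pairs[-1] / pairs[-2]:.2f}x")
print(f"share of total cost in the last scale: {pairs[-1] / sum(pairs):.1%}")
```

Under these assumptions the final scale accounts for most of the attention cost, which is consistent with the abstract's point that skipping high-resolution scales saves time while sparsifying them is the alternative that keeps those scales.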

Code & Models

Repositories

CAS-CLab/SparVAR (GitHub)
