Frequency-Aware Autoregressive Modeling for Efficient High-Resolution Image Synthesis
Zhuokun Chen, Jugang Fan, Zhuowei Yu, Bohan Zhuang, Mingkui Tan

TL;DR
This paper introduces SparseVAR, a training-free framework that accelerates high-resolution image synthesis by dynamically excluding low-frequency tokens during autoregressive inference, achieving up to a 2x speedup with minimal degradation in image quality.
Contribution
SparseVAR is a novel, plug-and-play acceleration method that identifies and excludes low-frequency tokens in autoregressive image models without additional training.
Findings
Achieves up to 2x speedup in high-resolution image synthesis.
Maintains high image quality with minimal degradation.
Reduces computational overhead in high-resolution stages.
Abstract
Visual autoregressive modeling, based on the next-scale prediction paradigm, exhibits notable advantages in image quality and model scalability over traditional autoregressive and diffusion models. It generates images by progressively refining resolution across multiple stages. However, the computational overhead in high-resolution stages remains a critical challenge due to the substantial number of tokens involved. In this paper, we introduce SparseVAR, a plug-and-play acceleration framework for next-scale prediction that dynamically excludes low-frequency tokens during inference without requiring additional training. Our approach is motivated by the observation that tokens in low-frequency regions have a negligible impact on image quality in high-resolution stages and exhibit strong similarity with neighboring tokens. Additionally, we observe that different blocks in the next-scale…
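The idea of excluding low-frequency tokens can be made concrete with a small sketch. The abstract does not state how SparseVAR scores tokens, so everything below is an assumption made for illustration: the function name `high_frequency_mask`, the `keep_ratio` parameter, and the scoring rule (distance of each token from the mean of its 3x3 neighborhood, which is small where a token resembles its neighbors). It should be read as one plausible way to separate smooth regions from textured ones, not as the paper's criterion.

```python
import math


def high_frequency_mask(token_map, keep_ratio=0.5):
    """Return an H x W boolean mask marking the tokens to keep.

    token_map is a nested list of shape H x W x C (one feature vector per token).
    Hypothetical scoring rule: a token that is close to the mean of its 3x3
    neighborhood is treated as low-frequency and becomes a candidate to skip.
    """
    h, w = len(token_map), len(token_map[0])
    scores = []
    for y in range(h):
        for x in range(w):
            # 3x3 neighborhood, replicating edge tokens at the borders
            neigh = [
                token_map[min(max(y + dy, 0), h - 1)][min(max(x + dx, 0), w - 1)]
                for dy in (-1, 0, 1)
                for dx in (-1, 0, 1)
            ]
            local_mean = [sum(channel) / 9.0 for channel in zip(*neigh)]
            centre = token_map[y][x]
            # Euclidean distance between the token and its local mean
            score = math.sqrt(sum((c - m) ** 2 for c, m in zip(centre, local_mean)))
            scores.append((score, y, x))

    # Keep the highest-scoring fraction of tokens (at least one, at most all)
    n_keep = min(h * w, max(1, round(keep_ratio * h * w)))
    scores.sort(key=lambda item: item[0], reverse=True)
    mask = [[False] * w for _ in range(h)]
    for _, y, x in scores[:n_keep]:
        mask[y][x] = True
    return mask


# Usage: an 8x8 map that is flat except for a 4x4 checkerboard patch
tokens = [[[0.0] * 4 for _ in range(8)] for _ in range(8)]
for y in range(2, 6):
    for x in range(2, 6):
        tokens[y][x] = [(-1.0) ** (y + x)] * 4

mask = high_frequency_mask(tokens, keep_ratio=0.25)
kept = sum(sum(row) for row in mask)
print(f"kept {kept} of {8 * 8} tokens")
```

In this toy input the textured patch receives the highest scores, so a `keep_ratio` of 0.25 retains those 16 tokens and marks the flat background as skippable. How the skipped positions are filled in afterward is not covered here, since the abstract is truncated before it describes that step.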
Taxonomy
Topics: Generative Adversarial Networks and Image Synthesis · Advanced Image Processing Techniques · Computer Graphics and Visualization Techniques
