Autoregressive Image Generation Needs Only a Few Lines of Cached Tokens
Ziran Qin, Youru Lv, Mingbao Lin, Zeren Zhang, Chanfan Gan, Tieyuan Chen, Weiyao Lin

TL;DR
LineAR introduces a cache compression method for autoregressive image generation that reduces memory usage and increases speed while maintaining or improving image quality, applicable across various models.
Contribution
A training-free, line-level cache management technique that efficiently compresses visual tokens for autoregressive image generation, enhancing performance and scalability.
Findings
Reduces memory by up to 67.61% and speeds up generation by 7.57x.
Maintains or improves image generation quality across multiple models.
Validates effectiveness on diverse datasets and models.
Abstract
Autoregressive (AR) visual generation has emerged as a powerful paradigm for image and multimodal synthesis, owing to its scalability and generality. However, existing AR image generation suffers from severe memory bottlenecks due to the need to cache all previously generated visual tokens during decoding, leading to both high storage requirements and low throughput. In this paper, we introduce \textbf{LineAR}, a novel, training-free progressive key-value (KV) cache compression pipeline for autoregressive image generation. By fully exploiting the intrinsic characteristics of visual attention, LineAR manages the cache at the line level using a 2D view, preserving the visual dependency regions while progressively evicting less-informative tokens that are harmless for subsequent line generation, guided by inter-line attention. LineAR enables efficient autoregressive (AR) image generation…
Peer Reviews
No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.
Videos
No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.
Taxonomy
TopicsGenerative Adversarial Networks and Image Synthesis · Image Enhancement Techniques · Computer Graphics and Visualization Techniques
