Nested AutoRegressive Models
Hongyu Wu, Xuhui Fan, Zhangkai Wu, Longbing Cao

TL;DR
NestAR introduces a hierarchical nested autoregressive architecture for image generation, reducing complexity and increasing diversity while maintaining competitive quality.
Contribution
The paper proposes a novel nested AR architecture with multi-scale modules, lowering computational complexity from linear to logarithmic and enhancing image diversity.
Findings
Reduces generation complexity from O(n) to O(log n)
Achieves competitive image quality with lower computational cost
Enhances image diversity compared to existing AR models
Abstract
AutoRegressive (AR) models have demonstrated competitive performance in image generation, achieving results comparable to those of diffusion models. However, their token-by-token image generation mechanism remains computationally intensive, and existing solutions such as VAR often lead to limited sample diversity. In this work, we propose a Nested AutoRegressive (NestAR) model, which uses nested AR architectures to generate images. NestAR arranges multi-scale modules in a hierarchical order. These modules of different scales are themselves organized in an AR architecture, where each larger-scale module is conditioned on the outputs of the preceding smaller-scale module. Within each module, NestAR uses another AR structure to generate "patches" of tokens. The proposed nested AR architecture reduces the overall complexity from $\mathcal{O}(n)$ to $\mathcal{O}(\log n)$ in generating …
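The nesting described in the abstract can be illustrated with a toy step count. This is only a sketch under stated assumptions, not the authors' implementation: it assumes the scales hold 1, 2, 4, ... tokens up to n (so the outer AR loop runs about log2(n) + 1 times) and that each module emits at most a fixed number of patches sequentially (the inner AR loop). The function names and the `patches_per_module` parameter are hypothetical.

```python
def token_by_token_steps(n_tokens: int) -> int:
    """Sequential steps when every token is generated one at a time."""
    return n_tokens


def nested_steps(n_tokens: int, patches_per_module: int = 4) -> int:
    """Sequential steps for a toy nested AR scheme (illustrative assumptions).

    Outer AR loop: modules at scales of 1, 2, 4, ..., n_tokens tokens, each
    conditioned on the previous, smaller scale.
    Inner AR loop: each module emits at most `patches_per_module` patches,
    one after another; tokens inside a patch are assumed to come out together.
    """
    if n_tokens < 1 or n_tokens & (n_tokens - 1):
        raise ValueError("n_tokens must be a positive power of two in this sketch")
    num_modules = n_tokens.bit_length()  # log2(n_tokens) + 1 scales
    steps = 0
    for level in range(num_modules):  # outer AR over scales
        tokens_at_scale = 1 << level
        # inner AR over patches; a scale cannot have more patches than tokens
        steps += min(patches_per_module, tokens_at_scale)
    return steps


if __name__ == "__main__":
    for n in (256, 1024):
        print(n, token_by_token_steps(n), nested_steps(n))
```

Under these toy assumptions, 256 tokens take 256 sequential steps token by token versus 31 in the nested scheme, and quadrupling to 1024 tokens adds only 8 more nested steps. These counts are arithmetic on the sketch, not results from the paper.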
