FlowAR: Scale-wise Autoregressive Image Generation Meets Flow Matching
Sucheng Ren, Qihang Yu, Ju He, Xiaohui Shen, Alan Yuille, Liang-Chieh Chen

TL;DR
FlowAR introduces a simplified, scalable autoregressive image generation method that improves generalization and flexibility by using a straightforward scale doubling approach and flow matching, outperforming previous models on ImageNet-256.
Contribution
FlowAR proposes a streamlined scale-wise autoregressive model that replaces complex multi-scale residual tokenizers with a simple doubling scale design, enabling better generalization and modularity.
Findings
Outperforms previous methods on the ImageNet-256 benchmark
Simplifies scale design for improved generalization
Facilitates integration with flow matching for high-quality synthesis
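The flow matching mentioned in the findings above can be illustrated with a minimal sketch. This summary does not say which probability path or sampler FlowAR uses, so the sketch assumes the common linear-path formulation: a noise sample x0 and a data sample x1 are blended as x_t = (1 - t) * x0 + t * x1, and the regression target is the constant velocity x1 - x0. The names `interpolate`, `velocity_target`, `euler_sample`, and `velocity_fn` are hypothetical; in a real model `velocity_fn` would be a learned network, presumably conditioned on coarser scales.

```python
from typing import Callable, List

Vector = List[float]


def interpolate(x0: Vector, x1: Vector, t: float) -> Vector:
    """Point on the straight line from noise x0 (t=0) to data x1 (t=1)."""
    return [(1.0 - t) * a + t * b for a, b in zip(x0, x1)]


def velocity_target(x0: Vector, x1: Vector) -> Vector:
    """Regression target for the linear path: d/dt x_t = x1 - x0."""
    return [b - a for a, b in zip(x0, x1)]


def euler_sample(
    velocity_fn: Callable[[Vector, float], Vector],
    x0: Vector,
    num_steps: int,
) -> Vector:
    """Integrate dx/dt = velocity_fn(x, t) from t=0 to t=1 with Euler steps."""
    if num_steps < 1:
        raise ValueError("num_steps must be at least 1")
    x = list(x0)
    dt = 1.0 / num_steps
    for step in range(num_steps):
        t = step * dt
        v = velocity_fn(x, t)
        x = [xi + dt * vi for xi, vi in zip(x, v)]
    return x


if __name__ == "__main__":
    noise = [0.0, 0.0]
    data = [1.0, -2.0]
    # Stand-in for a trained network: an oracle that returns the true velocity.
    oracle = lambda x, t: velocity_target(noise, data)
    print(euler_sample(oracle, noise, num_steps=10))
```

With the oracle velocity the Euler integration lands on the data sample up to floating-point error; a learned velocity field would only approximate this.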
Abstract
Autoregressive (AR) modeling has achieved remarkable success in natural language processing by enabling models to generate text with coherence and contextual understanding through next token prediction. Recently, in image generation, VAR proposes scale-wise autoregressive modeling, which extends the next token prediction to the next scale prediction, preserving the 2D structure of images. However, VAR encounters two primary challenges: (1) its complex and rigid scale design limits generalization in next scale prediction, and (2) the generator's dependence on a discrete tokenizer with the same complex scale structure restricts modularity and flexibility in updating the tokenizer. To address these limitations, we introduce FlowAR, a general next scale prediction method featuring a streamlined scale design, where each subsequent scale is simply double the previous one. This eliminates the…
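As an illustration of the streamlined scale design described in the abstract, where each scale is double the previous one, the sketch below builds a doubling schedule of side lengths and a matching coarse-to-fine pyramid. It is an assumption-laden toy, not the paper's implementation: the grid sizes, the use of 2x2 average pooling for downsampling, and the names `doubling_scales`, `halve`, and `build_pyramid` are all hypothetical.

```python
from typing import List

Grid = List[List[float]]


def doubling_scales(final_size: int, num_scales: int) -> List[int]:
    """Side lengths from coarsest to finest, each double the previous one."""
    if num_scales < 1:
        raise ValueError("num_scales must be at least 1")
    base, remainder = divmod(final_size, 2 ** (num_scales - 1))
    if base < 1 or remainder != 0:
        raise ValueError("final_size must be divisible by 2**(num_scales - 1)")
    return [base * 2 ** i for i in range(num_scales)]


def halve(grid: Grid) -> Grid:
    """Downsample a square grid by 2 using 2x2 average pooling (an assumption)."""
    n = len(grid) // 2
    return [
        [
            (grid[2 * r][2 * c] + grid[2 * r][2 * c + 1]
             + grid[2 * r + 1][2 * c] + grid[2 * r + 1][2 * c + 1]) / 4.0
            for c in range(n)
        ]
        for r in range(n)
    ]


def build_pyramid(grid: Grid, num_scales: int) -> List[Grid]:
    """Return grids ordered coarse to fine, matching doubling_scales."""
    doubling_scales(len(grid), num_scales)  # validates divisibility
    levels = [grid]
    for _ in range(num_scales - 1):
        levels.append(halve(levels[-1]))
    return levels[::-1]


if __name__ == "__main__":
    print(doubling_scales(16, 5))  # [1, 2, 4, 8, 16]
    toy = [[float(r * 4 + c) for c in range(4)] for r in range(4)]
    for level in build_pyramid(toy, 3):
        print(len(level), "x", len(level))
```

In a next-scale prediction setup, a generator would produce these levels in order from coarsest to finest, each conditioned on the levels before it. Because every step is the same factor of 2, the schedule is fixed by just the final size and the number of scales.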
Taxonomy
Topics: Medical Image Segmentation Techniques · Generative Adversarial Networks and Image Synthesis · Advanced Neural Network Applications
