MixAR: Mixture Autoregressive Image Generation

Jinyuan Hu; Jiayou Zhang; Shaobo Cui; Kun Zhang; Guangyi Chen

arXiv:2511.12181 · cs.CV · November 18, 2025

TL;DR

MixAR is a mixture training framework that injects discrete tokens as prior guidance into continuous autoregressive image generation. It targets two limitations: the fine-grained information lost to quantization with a finite codebook, and the difficulty of modeling a vast, unstructured continuous latent space.

Contribution

The paper proposes MixAR, a new framework that integrates discrete token guidance with continuous autoregressive models for enhanced image generation.

Findings

01. DC-Mix offers a good balance between efficiency and fidelity.

02. TI-Mix improves training and inference consistency.

03. MixAR achieves higher-quality image generation.

Abstract

Autoregressive (AR) approaches, which represent images as sequences of discrete tokens from a finite codebook, have achieved remarkable success in image generation. However, the quantization process and the limited codebook size inevitably discard fine-grained information, placing bottlenecks on fidelity. Motivated by this limitation, recent studies have explored autoregressive modeling in continuous latent spaces, which offers higher generation quality. Yet, unlike discrete tokens constrained by a fixed codebook, continuous representations lie in a vast and unstructured space, posing significant challenges for efficient autoregressive modeling. To address these challenges, we introduce MixAR, a novel framework that leverages mixture training paradigms to inject discrete tokens as prior guidance for continuous AR modeling. MixAR is a factorized formulation that leverages discrete tokens…
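
The claim that quantization with a limited codebook discards fine-grained information can be illustrated with a toy nearest-neighbor vector quantizer. The sketch below is not taken from the paper: the random Gaussian latents, the codebook sizes (8 and 256), the dimension, and the helper names `quantize` and `mean_quantization_error` are all assumptions chosen for illustration. It only shows that a larger codebook lowers the reconstruction error without removing it.

```python
import random

def quantize(vec, codebook):
    """Return (index, codeword) of the nearest codebook entry (squared L2 distance)."""
    best = min(
        range(len(codebook)),
        key=lambda i: sum((a - b) ** 2 for a, b in zip(vec, codebook[i])),
    )
    return best, codebook[best]

def mean_quantization_error(latents, codebook):
    """Average squared L2 distance between each latent and its nearest codeword."""
    total = 0.0
    for v in latents:
        _, q = quantize(v, codebook)
        total += sum((a - b) ** 2 for a, b in zip(v, q))
    return total / len(latents)

# Hypothetical toy data: 200 continuous "latents" of dimension 4.
rng = random.Random(0)
dim = 4
latents = [[rng.gauss(0, 1) for _ in range(dim)] for _ in range(200)]

# Hypothetical codebooks: the large one extends the small one, so its
# error can never be higher than the small codebook's error.
small_codebook = [[rng.gauss(0, 1) for _ in range(dim)] for _ in range(8)]
large_codebook = small_codebook + [
    [rng.gauss(0, 1) for _ in range(dim)] for _ in range(248)
]

err_small = mean_quantization_error(latents, small_codebook)
err_large = mean_quantization_error(latents, large_codebook)
print(f"mean squared error,   8 codewords: {err_small:.4f}")
print(f"mean squared error, 256 codewords: {err_large:.4f}")
```

In this toy setting the discrete index returned by `quantize` is a coarse summary of the latent, and the residual between the latent and its codeword is the detail a purely discrete model cannot represent.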

Taxonomy

Topics: Generative Adversarial Networks and Image Synthesis · Domain Adaptation and Few-Shot Learning · Face recognition and analysis