XQ-GAN: An Open-source Image Tokenization Framework for Autoregressive Generation
Xiang Li, Kai Qiu, Hao Chen, Jason Kuen, Jiuxiang Gu, Jindong Wang, Zhe Lin, Bhiksha Raj

TL;DR
XQ-GAN is an open-source image tokenization framework that supports several quantization methods (VQ, RQ, MSVQ, PQ, LFQ, and BSQ). It improves image reconstruction and generation quality on ImageNet 256x256 and provides pre-trained models for community use.
Contribution
The paper presents XQ-GAN, a flexible framework integrating multiple state-of-the-art quantization techniques for enhanced image tokenization in generative models.
Findings
Achieves an rFID of 0.64 on ImageNet 256x256, surpassing previous models.
Improves gFID metrics when used as a tokenizer, e.g., gFID of 2.6 with VAR.
Provides pre-trained tokenizers for community research and development.
Abstract
Image tokenizers play a critical role in shaping the performance of subsequent generative models. Since the introduction of VQ-GAN, discrete image tokenization has undergone remarkable advancements. Improvements in architecture, quantization techniques, and training recipes have significantly enhanced both image reconstruction and the downstream generation quality. In this paper, we present XQ-GAN, an image tokenization framework designed for both image reconstruction and generation tasks. Our framework integrates state-of-the-art quantization techniques, including vector quantization (VQ), residual quantization (RQ), multi-scale residual quantization (MSVQ), product quantization (PQ), lookup-free quantization (LFQ), and binary spherical quantization (BSQ), within a highly flexible and customizable training environment. On the standard ImageNet 256x256 benchmark, our released model…
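To make three of the quantizers named in the abstract concrete, here is a minimal pure-Python sketch of VQ, RQ, and LFQ applied to a single latent vector. It is an illustration under stated assumptions, not the XQ-GAN implementation: the function names, the toy codebook, and the choice to share one codebook across all RQ depths are hypothetical, and the learned encoder, the training losses, and the MSVQ, PQ, and BSQ variants are omitted.

```python
def nearest_code(vec, codebook):
    """Return the index of the codebook entry closest to vec (squared L2)."""
    def sq_dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return min(range(len(codebook)), key=lambda i: sq_dist(vec, codebook[i]))


def vector_quantize(vec, codebook):
    """VQ: one token per vector, the index of the nearest codebook entry."""
    idx = nearest_code(vec, codebook)
    return idx, list(codebook[idx])


def residual_quantize(vec, codebook, depth):
    """RQ: quantize, subtract the result, then quantize the residual again.

    Assumption: a single codebook is shared across all depths.
    """
    residual = list(vec)
    approx = [0.0] * len(vec)
    tokens = []
    for _ in range(depth):
        idx, q = vector_quantize(residual, codebook)
        tokens.append(idx)
        approx = [a + b for a, b in zip(approx, q)]
        residual = [r - b for r, b in zip(residual, q)]
    return tokens, approx


def lookup_free_quantize(vec):
    """LFQ: no codebook; each dimension maps to -1 or +1 by its sign,
    and the resulting bit pattern is read as the token index."""
    bits = [1 if x > 0 else 0 for x in vec]
    index = sum(b << i for i, b in enumerate(bits))
    return index, [1.0 if b else -1.0 for b in bits]


if __name__ == "__main__":
    toy_codebook = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]  # hypothetical
    print(vector_quantize([1.4, 0.6], toy_codebook))
    print(residual_quantize([1.4, 0.6], toy_codebook, depth=2))
    print(lookup_free_quantize([0.5, -0.2, 0.1]))
```

In this sketch, VQ emits one token per vector, RQ emits `depth` tokens whose codebook entries sum to a finer approximation, and LFQ needs no nearest-neighbour search because the token index is read directly from the signs of the latent dimensions.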
Taxonomy
Topics: Advanced Image and Video Retrieval Techniques · Generative Adversarial Networks and Image Synthesis · Image Retrieval and Classification Techniques
