Not All Image Regions Matter: Masked Vector Quantization for Autoregressive Image Generation
Mengqi Huang, Zhendong Mao, Quan Wang, Yongdong Zhang

TL;DR
This paper introduces a two-stage image generation framework that uses importance perception to mask redundant image regions during quantization, which lowers training and generation cost while preserving important image structures.
Contribution
The paper proposes Masked Quantization VAE (MQ-VAE) and Stackformer, incorporating adaptive masking to reduce redundancy in the learned codebook and improve the quality of autoregressive image generation.
Findings
Reduces training cost and speeds up generation.
Maintains important image structures during quantization.
Improves overall image generation quality.
Abstract
Existing autoregressive models follow the two-stage generation paradigm that first learns a codebook in the latent space for image reconstruction and then completes the image generation autoregressively based on the learned codebook. However, existing codebook learning simply models all local region information of images without distinguishing their different perceptual importance, which brings redundancy in the learned codebook that not only limits the next stage's autoregressive model's ability to model important structure but also results in high training cost and slow generation speed. In this study, we borrow the idea of importance perception from classical image coding theory and propose a novel two-stage framework, which consists of Masked Quantization VAE (MQ-VAE) and Stackformer, to relieve the model from modeling redundancy. Specifically, MQ-VAE incorporates an adaptive mask…
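The idea of quantizing only the perceptually important regions can be illustrated with a small sketch. This is not the authors' implementation: the function names (`nearest_code`, `masked_quantize`), the `keep_ratio` parameter, and the assumption that per-region importance scores are already available are all hypothetical choices made for illustration. The sketch keeps the highest-scoring regions, maps each kept region to its nearest codebook entry, and records the position so a second-stage model could see both the code and where it belongs.

```python
def nearest_code(vec, codebook):
    """Return the index of the codebook entry closest to vec (squared L2 distance)."""
    best_idx, best_dist = 0, float("inf")
    for idx, code in enumerate(codebook):
        dist = sum((v - c) ** 2 for v, c in zip(vec, code))
        if dist < best_dist:
            best_idx, best_dist = idx, dist
    return best_idx


def masked_quantize(features, scores, codebook, keep_ratio=0.5):
    """Quantize only the highest-scoring regions (hypothetical helper).

    features: list of per-region feature vectors
    scores:   assumed importance score per region (higher = more important)
    Returns a list of (position, code_index) pairs, sorted by position.
    """
    n_keep = max(1, int(len(features) * keep_ratio))
    # Rank regions by importance and keep the top n_keep.
    ranked = sorted(range(len(features)), key=lambda i: scores[i], reverse=True)
    kept = sorted(ranked[:n_keep])
    # Masked-out regions never reach the codebook lookup.
    return [(pos, nearest_code(features[pos], codebook)) for pos in kept]


# Toy example: four regions, a two-entry codebook, keep the top half.
features = [[0.0, 0.1], [0.9, 1.0], [0.1, 0.0], [1.0, 0.9]]
scores = [0.2, 0.9, 0.1, 0.8]
codebook = [[0.0, 0.0], [1.0, 1.0]]
tokens = masked_quantize(features, scores, codebook, keep_ratio=0.5)
print(tokens)  # [(1, 1), (3, 1)]
```

With `keep_ratio=0.5` the token sequence is half as long as the full grid, which is the source of the training and generation savings the summary describes. How the scores are produced and how masked regions are recovered at decoding time are left out here.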
Taxonomy
Topics: Advanced Image and Video Retrieval Techniques · Generative Adversarial Networks and Image Synthesis · Image Retrieval and Classification Techniques
