Draft-and-Revise: Effective Image Generation with Contextual RQ-Transformer
Doyup Lee, Chiheon Kim, Saehoon Kim, Minsu Cho, Wook-Shin Han

TL;DR
This paper introduces a novel image generation framework called Draft-and-Revise with Contextual RQ-Transformer, which considers global image context during generation, leading to state-of-the-art results in conditional image synthesis.
Contribution
The paper proposes a new Draft-and-Revise decoding method with Contextual RQ-Transformer that effectively incorporates global context and improves image quality and diversity.
Findings
Achieves state-of-the-art results on conditional image generation.
Effectively controls quality-diversity trade-off in image synthesis.
Demonstrates improved global context reflection in generated images.
Abstract
Although autoregressive models have achieved promising results on image generation, their unidirectional generation process prevents the resultant images from fully reflecting global contexts. To address the issue, we propose an effective image generation framework of Draft-and-Revise with Contextual RQ-Transformer to consider global contexts during the generation process. As a generalized VQ-VAE, RQ-VAE first represents a high-resolution image as a sequence of discrete code stacks. After code stacks in the sequence are randomly masked, Contextual RQ-Transformer is trained to infill the masked code stacks based on the unmasked contexts of the image. Then, Contextual RQ-Transformer uses our two-phase decoding, Draft-and-Revise, and generates an image, while exploiting the global contexts of the image during the generation process. Specifically, in the draft phase, our model first focuses…
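The random-masking step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function name mask_code_stacks, the fixed mask ratio, the reserved mask_id value, and the toy sizes (64 positions, stack depth 4, codebook size 1024) are all assumptions chosen for illustration. The sketch only shows the idea that whole code stacks, rather than individual codes, are hidden at randomly chosen positions, leaving the remaining stacks as context for an infilling model.

```python
import numpy as np

def mask_code_stacks(codes, mask_ratio, mask_id, rng):
    """Hide whole code stacks at randomly chosen sequence positions.

    codes: integer array of shape (N, D), i.e. N positions, each holding
           a stack of D discrete codes (hypothetical layout).
    Returns the masked copy and a boolean array marking hidden positions.
    """
    n_positions = codes.shape[0]
    n_masked = int(round(mask_ratio * n_positions))
    # Sample positions without replacement so exactly n_masked are hidden.
    masked_positions = rng.choice(n_positions, size=n_masked, replace=False)
    is_masked = np.zeros(n_positions, dtype=bool)
    is_masked[masked_positions] = True
    masked = codes.copy()
    # Every code in a masked stack is replaced by the reserved mask id.
    masked[is_masked] = mask_id
    return masked, is_masked

# Toy example: all sizes below are illustrative assumptions.
rng = np.random.default_rng(0)
codes = rng.integers(0, 1024, size=(64, 4))  # code values in 0..1023
masked, is_masked = mask_code_stacks(codes, mask_ratio=0.5, mask_id=1024, rng=rng)
```

An infilling model would then be trained to predict codes[is_masked] given masked as input. Using an id outside the codebook range for the mask (here 1024) keeps masked positions distinguishable from real codes.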
Taxonomy
Topics: Generative Adversarial Networks and Image Synthesis · Advanced Image and Video Retrieval Techniques · Advanced Vision and Imaging
Methods: VQ-VAE
