Progressive Text-to-Image Generation
Zhengcong Fei, Mingyuan Fan, Li Zhu, Junshi Huang

TL;DR
This paper introduces a progressive, coarse-to-fine text-to-image generation model that improves image quality and inference speed over traditional VQ-AR methods by creating images in a hierarchical manner with an error revision mechanism.
Contribution
The paper proposes a progressive, hierarchical generation approach that predicts image tokens in parallel and applies an error revision mechanism, improving both image quality and efficiency in text-to-image synthesis.
Findings
Significantly better FID scores on the MS COCO benchmark.
Over 13 times faster inference with minimal performance loss.
Effective coarse-to-fine image generation process.
Abstract
Recently, Vector Quantized AutoRegressive (VQ-AR) models have shown remarkable results in text-to-image synthesis by equally predicting discrete image tokens from the top left to bottom right in the latent space. Although the simple generative process surprisingly works well, is this the best way to generate the image? For instance, human creation is more inclined to the outline-to-fine of an image, while VQ-AR models themselves do not consider any relative importance of image patches. In this paper, we present a progressive model for high-fidelity text-to-image generation. The proposed method takes effect by creating new image tokens from coarse to fine based on the existing context in a parallel manner, and this procedure is recursively applied with the proposed error revision mechanism until an image sequence is completed. The resulting coarse-to-fine hierarchy makes the image…
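The coarse-to-fine procedure described in the abstract can be illustrated with a toy decoding loop. This is only a minimal sketch under stated assumptions, not the authors' implementation: `toy_predictor` is a hypothetical stand-in for the text-conditioned model, and both the confidence-based fill order and the overwrite-if-more-confident revision rule are assumptions chosen for illustration.

```python
import random

MASK = -1  # placeholder id for a position that has not been generated yet


def toy_predictor(tokens, vocab_size, rng):
    """Hypothetical stand-in for the model: one (token, confidence) per position.

    A real model would condition on the text prompt and on the tokens
    generated so far; this toy version just samples at random.
    """
    return [(rng.randrange(vocab_size), rng.random()) for _ in tokens]


def progressive_decode(seq_len=16, vocab_size=8, steps=4, seed=0):
    rng = random.Random(seed)
    tokens = [MASK] * seq_len
    scores = [0.0] * seq_len
    per_step = -(-seq_len // steps)  # ceiling division: positions filled per step

    for _ in range(steps):
        proposals = toy_predictor(tokens, vocab_size, rng)

        # Revision (assumed rule): overwrite an already generated token when
        # the new proposal is more confident than the stored one.
        for i, (tok, conf) in enumerate(proposals):
            if tokens[i] != MASK and conf > scores[i]:
                tokens[i], scores[i] = tok, conf

        # Parallel fill (assumed rule): commit the most confident empty
        # positions together in one step, rather than one token at a time
        # in raster order.
        empty = [i for i in range(seq_len) if tokens[i] == MASK]
        empty.sort(key=lambda i: proposals[i][1], reverse=True)
        for i in empty[:per_step]:
            tokens[i], scores[i] = proposals[i]

    return tokens


if __name__ == "__main__":
    print(progressive_decode())
```

With these defaults, the 16-token sequence is completed in 4 parallel steps instead of 16 sequential ones, which is the kind of saving a parallel scheme can offer over left-to-right decoding.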
Taxonomy
Topics: Advanced Image and Video Retrieval Techniques · Generative Adversarial Networks and Image Synthesis · Image Retrieval and Classification Techniques
