Megapixel Image Generation with Step-Unrolled Denoising Autoencoders
Alex F. McKinney, Chris G. Willcocks

TL;DR
This paper introduces a scalable, efficient framework that combines VQ-GAN, hourglass transformers, and step-unrolled denoising autoencoders (SUNDAE) to generate diverse, realistic megapixel images quickly on consumer hardware.
Contribution
It presents a novel combination of techniques and modifications that enable megapixel image generation with fast sampling, high scalability, and flexible capabilities, surpassing prior methods in efficiency and resolution.
Findings
Scaled to high-resolution ($1024 \times 1024$) image generation, with training taking 2-4 days.
Produced diverse, realistic megapixel samples in about 2 seconds on a GTX 1080Ti.
Obtained competitive FID scores close to state-of-the-art with fewer sampling steps.
Abstract
An ongoing trend in generative modelling research has been to push sample resolutions higher whilst simultaneously reducing computational requirements for training and sampling. We aim to push this trend further via the combination of techniques - each component representing the current pinnacle of efficiency in their respective areas. These include vector-quantized GAN (VQ-GAN), a vector-quantization (VQ) model capable of high levels of lossy - but perceptually insignificant - compression; hourglass transformers, a highly scaleable self-attention model; and step-unrolled denoising autoencoders (SUNDAE), a non-autoregressive (NAR) text generative model. Unexpectedly, our method highlights weaknesses in the original formulation of hourglass transformers when applied to multidimensional data. In light of this, we propose modifications to the resampling mechanism, applicable in any task…
Taxonomy
Topics: Generative Adversarial Networks and Image Synthesis · Image Retrieval and Classification Techniques · AI in cancer detection
Methods: Inpainting
