Wavelets to the Rescue: Improving Sample Quality of Latent Variable Deep Generative Models
Prashnna K Gyawali, Rudra Saha, Linwei Wang, VSR Veeravasarapu, and Maneesh Singh

TL;DR
This paper introduces a wavelet space VAE that models images in wavelet coefficient space, leading to higher quality image generation with better detail preservation and competitive FID scores compared to GANs.
Contribution
The novel wavelet space VAE emphasizes high-frequency details by modeling wavelet coefficients, improving sample quality over traditional VAEs.
Findings
Achieves better FID scores than standard VAEs on natural image datasets.
Generates images with higher detail and less blurriness.
Retains disentangled and informative latent representations.
Abstract
Variational Autoencoders (VAEs) are probabilistic deep generative models underpinned by elegant theory, stable training processes, and meaningful manifold representations. However, they produce blurry images because they lack an explicit emphasis on the high-frequency textural details of images, and because it is difficult to directly model the complex joint probability distribution over the high-dimensional image space. In this work, we approach these two challenges with a novel wavelet space VAE that uses the decoder to model the images in the wavelet coefficient space. This enables the VAE to emphasize the high-frequency components of an image obtained via wavelet decomposition. Additionally, by decomposing the complex function of generating high-dimensional images into an inverse wavelet transformation and the generation of wavelet coefficients, the latter becomes simpler to model by the VAE. We…
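The decomposition the abstract describes (the decoder produces wavelet coefficients, and a fixed inverse wavelet transform turns them into an image) can be illustrated with a small sketch. This is not the authors' implementation: the excerpt above does not state which wavelet family or how many decomposition levels the paper uses, so the single-level orthonormal Haar transform below is an assumption, and the function names haar_dwt2 and haar_idwt2 are hypothetical.

```python
import numpy as np


def haar_dwt2(img):
    """Single-level 2D orthonormal Haar transform (an assumed wavelet choice).

    img is an (H, W) array with even H and W. Returns four (H/2, W/2)
    subbands: one low-frequency approximation and three detail bands.
    """
    a = img[0::2, 0::2]  # top-left pixel of each 2x2 block
    b = img[0::2, 1::2]  # top-right
    c = img[1::2, 0::2]  # bottom-left
    d = img[1::2, 1::2]  # bottom-right
    ll = (a + b + c + d) / 2.0  # block average (low frequency)
    lh = (a + b - c - d) / 2.0  # top row minus bottom row
    hl = (a - b + c - d) / 2.0  # left column minus right column
    hh = (a - b - c + d) / 2.0  # diagonal difference
    return ll, lh, hl, hh


def haar_idwt2(ll, lh, hl, hh):
    """Inverse of haar_dwt2: rebuild the (H, W) image from the four subbands."""
    h, w = ll.shape
    img = np.empty((2 * h, 2 * w), dtype=np.result_type(ll, float))
    img[0::2, 0::2] = (ll + lh + hl + hh) / 2.0
    img[0::2, 1::2] = (ll + lh - hl - hh) / 2.0
    img[1::2, 0::2] = (ll - lh + hl - hh) / 2.0
    img[1::2, 1::2] = (ll - lh - hl + hh) / 2.0
    return img


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    image = rng.random((8, 8))
    # In a wavelet space VAE, subbands like these would be the decoder's
    # target; the image itself comes from the fixed inverse transform.
    subbands = haar_dwt2(image)
    restored = haar_idwt2(*subbands)
    print("max reconstruction error:", np.abs(restored - image).max())
```

Because the inverse transform is fixed and exact, any error in the generated image comes only from the predicted coefficients, and the three detail bands give the high-frequency content its own explicit outputs rather than leaving it implicit in pixel space.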
Taxonomy
Topics: Generative Adversarial Networks and Image Synthesis · Image Retrieval and Classification Techniques · Music and Audio Processing
Methods: Convolution
