Hierarchical Residual Learning Based Vector Quantized Variational Autoencoder for Image Reconstruction and Generation

Mohammad Adiban, Kalin Stefanov, Sabato Marco Siniscalchi, and Giampiero Salvi

arXiv:2208.04554 · cs.CV · August 10, 2022 · 1 citation

TL;DR

The paper introduces HR-VQVAE, a hierarchical vector quantized autoencoder that learns layered discrete representations for improved image reconstruction and generation, outperforming existing models in quality and diversity.

Contribution

It presents a novel hierarchical residual learning framework with a new objective function, enabling better discrete representations and addressing codebook collapse issues.

Findings

01. Reconstructs high-quality images with less distortion.

02. Generates diverse, high-quality images surpassing state-of-the-art models.

03. Reduces decoding time and scales codebook size effectively.

Abstract

We propose a multi-layer variational autoencoder method, which we call HR-VQVAE, that learns hierarchical discrete representations of the data. By utilizing a novel objective function, each layer in HR-VQVAE learns a discrete representation of the residual from previous layers through a vector quantized encoder. Furthermore, the representations at each layer are hierarchically linked to those at previous layers. We evaluate our method on the tasks of image reconstruction and generation. Experimental results demonstrate that the discrete representations learned by HR-VQVAE enable the decoder to reconstruct high-quality images with less distortion than the baseline methods, namely VQVAE and VQVAE-2. HR-VQVAE can also generate high-quality and diverse images that outperform state-of-the-art generative models, providing further verification of the efficiency of the learned representations. The…
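
The residual idea in the abstract, where each layer quantizes what the previous layers left unexplained, can be illustrated with a small sketch. This is not the authors' implementation: it omits the encoder, the decoder, the paper's objective function, and the hierarchical linking between layers, and the codebooks are hand-picked rather than learned. The names `nearest_code`, `residual_quantize`, and the toy values are assumptions made for illustration only.

```python
from typing import List, Sequence, Tuple

Vector = List[float]


def nearest_code(x: Sequence[float], codebook: Sequence[Sequence[float]]) -> int:
    """Return the index of the codebook vector closest to x (squared Euclidean)."""
    def sq_dist(code: Sequence[float]) -> float:
        return sum((xi - ci) ** 2 for xi, ci in zip(x, code))
    return min(range(len(codebook)), key=lambda k: sq_dist(codebook[k]))


def residual_quantize(
    x: Sequence[float],
    codebooks: Sequence[Sequence[Sequence[float]]],
) -> Tuple[List[int], Vector]:
    """Quantize x layer by layer; each layer encodes the residual so far.

    Hypothetical helper: one flat codebook per layer, no learned parameters.
    """
    residual = list(x)
    reconstruction = [0.0] * len(x)
    indices: List[int] = []
    for codebook in codebooks:
        k = nearest_code(residual, codebook)
        code = codebook[k]
        indices.append(k)
        # Add this layer's code to the running reconstruction ...
        reconstruction = [r + c for r, c in zip(reconstruction, code)]
        # ... and pass on only what is still unexplained.
        residual = [r - c for r, c in zip(residual, code)]
    return indices, reconstruction


# Toy example: a coarse first-layer codebook and a finer second-layer one.
codebooks = [
    [[0.0, 0.0], [1.0, 1.0], [2.0, 2.0]],
    [[0.0, 0.0], [0.25, -0.25], [-0.25, 0.25]],
]
x = [1.2, 0.8]
indices, reconstruction = residual_quantize(x, codebooks)
print(indices, reconstruction)  # [1, 1] [1.25, 0.75]
```

In this toy case the first layer alone reconstructs `[1.0, 1.0]`, and the second layer's residual code moves the result to `[1.25, 0.75]`, closer to the input. Adding layers refines the approximation without enlarging any single codebook.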

Code & Models

Repositories

mohammad-adiban/Video-Prediction (PyTorch, official)

Taxonomy

Topics: Generative Adversarial Networks and Image Synthesis · AI in cancer detection · Cell Image Analysis Techniques

Methods: PixelCNN · VQ-VAE-2 · VQ-VAE