HQ-VAE: Hierarchical Discrete Representation Learning with Variational Bayes
Yuhta Takida, Yukara Ikemiya, Takashi Shibuya, Kazuki Shimada, Woosung Choi, Chieh-Hsin Lai, Naoki Murata, Toshimitsu Uesaka, Kengo Uchida, Wei-Hsiang Liao, Yuki Mitsufuji

TL;DR
HQ-VAE introduces a Bayesian hierarchical discrete representation learning framework that improves codebook utilization and reconstruction quality in autoencoders, applicable across image and audio data.
Contribution
The paper proposes HQ-VAE, a stochastic hierarchical discrete representation model that generalizes existing hierarchical VQ-VAE variants and equips them with a Bayesian training scheme to address codebook collapse.
Findings
Enhanced codebook usage demonstrated on image datasets.
Improved reconstruction accuracy over existing hierarchical VQ-VAE models.
Validated applicability to audio data.
Abstract
Vector quantization (VQ) is a technique to deterministically learn features with discrete codebook representations. It is commonly performed with a variational autoencoding model, VQ-VAE, which can be further extended to hierarchical structures for making high-fidelity reconstructions. However, such hierarchical extensions of VQ-VAE often suffer from the codebook/layer collapse issue, where the codebook is not efficiently used to express the data, and hence degrades reconstruction accuracy. To mitigate this problem, we propose a novel unified framework to stochastically learn hierarchical discrete representation on the basis of the variational Bayes framework, called hierarchically quantized variational autoencoder (HQ-VAE). HQ-VAE naturally generalizes the hierarchical variants of VQ-VAE, such as VQ-VAE-2 and residual-quantized VAE (RQ-VAE), and provides them with a Bayesian training…
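To make the terms in the abstract concrete, the following is a minimal sketch of the deterministic baseline it describes: nearest-neighbour vector quantization, a residual (RQ-style) hierarchy, and a perplexity measure of codebook usage. It is not the HQ-VAE training scheme; the stochastic, variational Bayes part is not shown. The function names, array shapes, and random codebooks are assumptions made for illustration and do not come from the paper.

```python
import numpy as np

def quantize(z, codebook):
    """Deterministic VQ: map each row of z to its nearest codebook vector."""
    # Squared Euclidean distance between every feature and every code: shape (N, K)
    d = ((z[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=-1)
    idx = d.argmin(axis=1)
    return codebook[idx], idx

def residual_quantize(z, codebooks):
    """RQ-style hierarchy: each layer quantizes the residual left by earlier layers."""
    approx = np.zeros_like(z)
    indices = []
    for cb in codebooks:
        q, idx = quantize(z - approx, cb)
        approx = approx + q
        indices.append(idx)
    return approx, indices

def codebook_perplexity(idx, num_codes):
    """exp(entropy) of code usage: 1 when one code is used, num_codes when uniform."""
    p = np.bincount(idx, minlength=num_codes) / idx.size
    p = p[p > 0]  # skip unused codes to avoid log(0)
    return float(np.exp(-(p * np.log(p)).sum()))

# Hypothetical data: 256 feature vectors of dimension 8, two layers of 16 codes each.
rng = np.random.default_rng(0)
z = rng.normal(size=(256, 8))
codebooks = [rng.normal(size=(16, 8)) * s for s in (1.0, 0.5)]

approx, indices = residual_quantize(z, codebooks)
for layer, idx in enumerate(indices):
    print(f"layer {layer}: perplexity = {codebook_perplexity(idx, 16):.2f} of 16")
```

A perplexity far below the codebook size in some layer is one common symptom of the codebook/layer collapse that the abstract refers to.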
Taxonomy
Topics: Image and Signal Denoising Methods · AI in cancer detection · Cancer-related molecular mechanisms research
Methods: PixelCNN · VQ-VAE · VQ-VAE-2
