Masked Vector Quantization

David D. Nguyen; David Leibowitz; Surya Nepal; Salil S. Kanhere

arXiv:2301.06626 · cs.LG · March 26, 2024

Open Access

TL;DR

The paper introduces Masked Vector Quantization (MVQ), a framework for discrete latent representations in generative models. On ImageNet 64x64 it improves sample quality (lower FID) and speeds up token sampling while using fewer tokens per instance and fewer codebook entries.

Contribution

MVQ increases code vector capacity through mask learning with MH-Dropout, reducing sampling time and improving generative quality in vector quantization architectures.

Findings

01. Up to 68% FID reduction on ImageNet 64x64

02. 7-45x faster token sampling during inference

03. Smaller latent spaces enable transferable visual representations

Abstract

Generative models with discrete latent representations have recently demonstrated an impressive ability to learn complex high-dimensional data distributions. However, their performance relies on a long sequence of tokens per instance and a large number of codebook entries, resulting in long sampling times and considerable computation to fit the categorical posterior. To address these issues, we propose the Masked Vector Quantization (MVQ) framework, which increases the representational capacity of each code vector by learning mask configurations via a stochastic winner-takes-all training regime called Multiple Hypothesis Dropout (MH-Dropout). On ImageNet $64 \times 64$, MVQ reduces FID in existing vector quantization architectures by up to $68\%$ at 2 tokens per instance and $57\%$ at 5 tokens. These improvements widen as the number of codebook entries is reduced, and allow for $7$-$45\times$ …

Taxonomy

Topics: AI in cancer detection · Generative Adversarial Networks and Image Synthesis · Advanced Image and Video Retrieval Techniques

Methods: Dropout