Representation Collapsing Problems in Vector Quantization
Wenhao Zhao, Qiran Zou, Rushi Shah, Dianbo Liu

TL;DR
This paper investigates the phenomenon of representation collapse in vector quantization within generative models, identifying causes and proposing mitigation strategies to improve the diversity and discriminative power of learned representations.
Contribution
This is the first comprehensive study of representation collapse in vector quantization; it identifies the causes of collapse and proposes mitigation strategies.
Findings
Restricted initialization leads to codebook token collapse
Limited encoder capacity causes embedding collapse
Proposed mitigation strategies can reduce collapse severity
Abstract
Vector quantization is a technique in machine learning that discretizes continuous representations into a set of discrete vectors. It is widely employed in tokenizing data representations for large language models, diffusion models, and other generative models. Despite its prevalence, the characteristics and behaviors of vector quantization in generative models remain largely underexplored. In this study, we investigate representation collapse in vector quantization - a critical degradation where codebook tokens or latent embeddings lose their discriminative power by converging to a limited subset of values. This collapse fundamentally compromises the model's ability to capture diverse data patterns. By leveraging both synthetic and real datasets, we identify the severity of each type of collapse and its triggering conditions. Our analysis reveals that restricted initialization and limited…
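To make the terms in the abstract concrete, the following is a minimal sketch of nearest-neighbor vector quantization and of one common way to gauge how much of a codebook is actually used. It is a toy illustration, not the paper's experimental setup: the function names (quantize, codebook_usage), the 2-D data, the two example codebooks, and the choice of perplexity as a usage measure are all assumptions made for this example.

```python
import math
from collections import Counter


def quantize(vectors, codebook):
    """Map each vector to the index of its nearest codebook entry (squared L2 distance)."""
    indices = []
    for v in vectors:
        dists = [sum((a - b) ** 2 for a, b in zip(v, c)) for c in codebook]
        indices.append(dists.index(min(dists)))
    return indices


def codebook_usage(indices, codebook_size):
    """Return (fraction of codes used, perplexity of the empirical code distribution)."""
    counts = Counter(indices)
    total = len(indices)
    probs = [n / total for n in counts.values()]
    entropy = -sum(p * math.log(p) for p in probs)
    return len(counts) / codebook_size, math.exp(entropy)


# Hypothetical toy data: two well-separated clusters in 2-D.
vectors = [(0.0, 0.1), (0.1, 0.0), (5.0, 5.1), (5.1, 5.0)]

# Hypothetical codebooks: one spread over the data, one confined to a narrow region.
spread_codebook = [(0.0, 0.0), (5.0, 5.0)]
narrow_codebook = [(0.0, 0.0), (-1.0, -1.0)]

spread_idx = quantize(vectors, spread_codebook)
narrow_idx = quantize(vectors, narrow_codebook)

print("spread:", spread_idx, codebook_usage(spread_idx, len(spread_codebook)))
print("narrow:", narrow_idx, codebook_usage(narrow_idx, len(narrow_codebook)))
```

In this toy case the spread codebook assigns each cluster its own code, while the narrow codebook maps every vector to a single code, so half of its entries go unused. This only illustrates what an underused codebook looks like at assignment time; it does not model training dynamics or reproduce the paper's analysis.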
Taxonomy
Topics: Photonic and Optical Devices · Mathematical Analysis and Transform Methods
Methods: Sparse Evolutionary Training · Diffusion
