Early Quantization Shrinks Codebook: A Simple Fix for Diversity-Preserving Tokenization

Wenhao Zhao; Qiran Zou; Rushi Shah; Yudi Wu; Zhouhan Lin; Dianbo Liu

arXiv:2603.17052 · cs.LG · March 19, 2026

TL;DR

This paper investigates representation collapse in vector quantization as used in generative models, identifies random initialization and limited encoder capacity as causes, and proposes fixes to mitigate the collapse.

Contribution

It provides the first comprehensive analysis of representation collapses in vector quantization and offers potential fixes for these issues.

Findings

01 Identified severity and conditions of codebook and embedding collapses

02 Random initialization and limited encoder capacity cause collapses

03 Proposed solutions mitigate collapse problems

Abstract

Vector quantization is a technique in machine learning that discretizes continuous representations into a set of discrete vectors. It is widely employed in tokenizing data representations for large language models, diffusion models, and other generative models. Despite its prevalence, the characteristics and behaviors of vector quantization in generative models remain largely underexplored. In this study, we systematically investigate the issue of collapse in vector quantization, where collapsed representations are observed across both discrete codebook tokens and continuous latent embeddings. Using both synthetic and real datasets, we identify the severity of each type of collapse and its triggering conditions. Our analysis reveals that random initialization and limited encoder capacity result in token collapse and embedding collapse. Building on these findings, we propose potential…
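To make the terms in the abstract concrete, here is a minimal sketch of nearest-neighbor vector quantization together with a simple codebook-usage measure. It is an illustration only, not the authors' experimental setup: the toy data, the codebook size of 16, the sampling ranges, and the helper names `quantize` and `codebook_usage` are all assumptions chosen for this example. The codebook is deliberately initialized at random over a much wider range than the data, so that only a few codes are likely to be selected.

```python
import random


def quantize(vectors, codebook):
    """Map each vector to the index of its nearest codebook entry (squared L2)."""
    indices = []
    for v in vectors:
        dists = [sum((a - b) ** 2 for a, b in zip(v, c)) for c in codebook]
        indices.append(dists.index(min(dists)))
    return indices


def codebook_usage(indices, codebook_size):
    """Fraction of codebook entries that were selected at least once."""
    return len(set(indices)) / codebook_size


rng = random.Random(0)

# Hypothetical toy data: 200 two-dimensional embeddings clustered near the origin.
embeddings = [(rng.gauss(0, 0.1), rng.gauss(0, 0.1)) for _ in range(200)]

# Hypothetical codebook: 16 randomly initialized codes spread far wider than the data.
codebook = [(rng.uniform(-5, 5), rng.uniform(-5, 5)) for _ in range(16)]

indices = quantize(embeddings, codebook)
usage = codebook_usage(indices, len(codebook))
print(f"codes used: {len(set(indices))}/{len(codebook)} (usage = {usage:.2f})")
```

A usage fraction well below 1 means most codes are never selected, which is one simple way to quantify the token collapse described above. How the paper itself measures collapse is not stated in this excerpt.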

Taxonomy

Topics: Language and cultural evolution · Generative Adversarial Networks and Image Synthesis · Natural Language Processing Techniques