Beyond Stationarity: Rethinking Codebook Collapse in Vector Quantization
Hao Lu, Onur C. Koyun, Yongxin Guo, Zhengjie Zhu, Abbas Alili, Metin Nafi Gurcan

TL;DR
This paper identifies nonstationary encoder updates as the cause of codebook collapse in vector quantization and proposes two methods, NSVQ and TransVQ, to improve codebook utilization and reconstruction quality.
Contribution
The paper offers a new theoretical explanation for codebook collapse, attributing it to encoder drift that leaves unselected code vectors without updates, and proposes two methods, NSVQ and TransVQ, to address it.
Findings
Both methods achieve near-complete codebook utilization.
They demonstrate superior reconstruction quality over baseline VQ methods.
Experiments validate the effectiveness of the proposed approaches.
Abstract
Vector Quantization (VQ) underpins many modern generative frameworks such as VQ-VAE, VQ-GAN, and latent diffusion models. Yet, it suffers from the persistent problem of codebook collapse, where a large fraction of code vectors remains unused during training. This work provides a new theoretical explanation by identifying the nonstationary nature of encoder updates as the fundamental cause of this phenomenon. We show that as the encoder drifts, unselected code vectors fail to receive updates and gradually become inactive. To address this, we propose two new methods: Non-Stationary Vector Quantization (NSVQ), which propagates encoder drift to non-selected codes through a kernel-based rule, and Transformer-based Vector Quantization (TransVQ), which employs a lightweight mapping to adaptively transform the entire codebook while preserving convergence to the k-means solution. Experiments on…
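The mechanism described in the abstract can be illustrated with a small NumPy sketch. It is not the paper's algorithm: the abstract does not give the NSVQ rule, so the Gaussian kernel, the `bandwidth` and `lr` parameters, and all function names below are assumptions chosen for illustration. The sketch contrasts a standard update, where only selected codes move, with a hypothetical kernel-weighted update that also shifts unselected codes by the displacement of nearby selected codes.

```python
import numpy as np

def assign(z, codebook):
    # Index of the nearest code (squared Euclidean distance) for each latent.
    d = ((z[:, None, :] - codebook[None, :, :]) ** 2).sum(-1)
    return d.argmin(axis=1)

def utilization(z, codebook):
    # Fraction of codes selected by at least one latent in this batch.
    return np.unique(assign(z, codebook)).size / codebook.shape[0]

def standard_update(z, codebook, lr=0.5):
    # Only selected codes move, toward the mean of their assigned latents.
    idx = assign(z, codebook)
    new = codebook.astype(float)  # astype returns a copy
    for k in np.unique(idx):
        new[k] += lr * (z[idx == k].mean(axis=0) - new[k])
    return new

def kernel_update(z, codebook, lr=0.5, bandwidth=1.0):
    # Hypothetical rule (an assumption, not taken from the paper):
    # unselected codes receive a Gaussian-weighted average of the
    # displacements applied to the selected codes.
    idx = assign(z, codebook)
    used = np.unique(idx)
    unused = np.setdiff1d(np.arange(codebook.shape[0]), used)
    new = standard_update(z, codebook, lr)
    if unused.size:
        delta = new[used] - codebook[used]
        d = ((codebook[unused][:, None, :] - codebook[used][None, :, :]) ** 2).sum(-1)
        # Subtract the row minimum before exponentiating to avoid underflow.
        w = np.exp(-(d - d.min(axis=1, keepdims=True)) / (2 * bandwidth ** 2))
        w /= w.sum(axis=1, keepdims=True)
        new[unused] = codebook[unused] + w @ delta
    return new

# Toy case: both latents select code 0, so code 1 is never chosen.
codebook = np.array([[0.0, 0.0], [10.0, 0.0]])
z = np.array([[1.0, 0.0], [1.2, 0.0]])
print(utilization(z, codebook))         # 0.5
print(standard_update(z, codebook)[1])  # code 1 stays at [10, 0]
print(kernel_update(z, codebook)[1])    # code 1 shifts to [10.55, 0]
```

In the toy case the unselected code stays fixed under the standard update while the latents move, which is the stale-code behavior the abstract associates with collapse. The kernel variant carries it along with the selected code's displacement.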
Taxonomy
Topics: Generative Adversarial Networks and Image Synthesis · Domain Adaptation and Few-Shot Learning · Single-cell and spatial transcriptomics
