LL-VQ-VAE: Learnable Lattice Vector-Quantization For Efficient Representations
Ahmed Khalil, Robert Piechocki, Raul Santos-Rodriguez

TL;DR
This paper introduces LL-VQ-VAE, which replaces the vector quantization layer of VQ-VAE with a learnable lattice. Compared with VQ-VAE under the same training conditions, it achieves lower reconstruction error, shorter training time, and high codebook utilization, demonstrated on FFHQ-1024, FashionMNIST, and Celeb-A.
Contribution
It proposes a novel lattice-based discretization layer for VQ-VAE that enhances efficiency, stability, and scalability of learning discrete representations.
Findings
Lower reconstruction errors compared to VQ-VAE
Faster training times
High codebook utilization and stability
Abstract
In this paper we introduce learnable lattice vector quantization and demonstrate its effectiveness for learning discrete representations. Our method, termed LL-VQ-VAE, replaces the vector quantization layer in VQ-VAE with lattice-based discretization. The learnable lattice imposes a structure over all discrete embeddings, acting as a deterrent against codebook collapse, leading to high codebook utilization. Compared to VQ-VAE, our method obtains lower reconstruction errors under the same training conditions, trains in a fraction of the time, and with a constant number of parameters (equal to the embedding dimension), making it a very scalable approach. We demonstrate these results on the FFHQ-1024 dataset and include FashionMNIST and Celeb-A.
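As a rough illustration of the lattice-based discretization described in the abstract, the sketch below assumes the simplest lattice consistent with a parameter count equal to the embedding dimension: an axis-aligned lattice with one spacing per dimension. The names `lattice_quantize`, `z`, `b`, and `D`, the fixed spacing of 0.5, and the use of plain rounding are illustrative assumptions, not the authors' implementation. In training, the spacings would be learned rather than fixed.

```python
import numpy as np


def lattice_quantize(z, b):
    """Snap each row of z (shape [N, D]) to the nearest lattice point.

    Assumption: the lattice is axis-aligned, i.e. its points are b * k for
    integer vectors k, where b (shape [D]) holds one spacing per dimension.
    """
    return np.round(z / b) * b


rng = np.random.default_rng(0)
D = 4                               # hypothetical embedding dimension
z = rng.normal(size=(8, D))         # stand-in for encoder outputs
b = np.full(D, 0.5)                 # hypothetical spacings (learned in practice)

z_q = lattice_quantize(z, b)        # discrete embeddings on the lattice
codes = np.round(z / b).astype(int) # integer lattice coordinates per embedding
```

Under this assumption the quantizer stores only the `D` values in `b`, regardless of how many distinct lattice points are used, and no nearest-neighbour search over a codebook is needed.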
Taxonomy
Topics: Advanced Image and Video Retrieval Techniques · Face and Expression Recognition · Advanced Data Compression Techniques
Methods: VQ-VAE
