Learning Low-Rank Representations for Model Compression
Zezhou Zhu, Yucong Zhou, Zhao Zhong

TL;DR
This paper introduces LR²VQ, a low-rank representation vector quantization method that enhances model compression by combining dimensionality reduction with clustering, leading to significant accuracy improvements over existing VQ techniques.
Contribution
The paper proposes LR²VQ, a novel end-to-end trainable vector quantization method that integrates low-rank approximation with subvector clustering for improved model compression.
Findings
Achieves 2.8%/1.0% top-1 accuracy improvements on ImageNet with ResNet-18/ResNet-50.
Provides a theoretical framework for estimating the optimal clustering dimensionality.
Demonstrates superior performance over previous VQ-based compression algorithms.
Abstract
Vector Quantization (VQ) is an appealing model compression method for obtaining a tiny model with little accuracy loss. While methods to obtain better codebooks and codes under a fixed clustering dimensionality have been extensively studied, optimizing the vectors themselves to improve clustering performance has not been carefully considered, especially through reducing vector dimensionality. This paper reports our recent progress on combining dimensionality compression with vector quantization, proposing a Low-Rank Representation Vector Quantization (LR²VQ) method that outperforms previous VQ algorithms in various tasks and architectures. LR²VQ joins low-rank representation with subvector clustering to construct a new kind of building block that is directly optimized through end-to-end training over the task loss. Our proposed design pattern introduces…
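The idea of clustering subvectors in a reduced-dimensional space can be illustrated with a minimal NumPy sketch. This is not the authors' method: the paper trains its building block end-to-end on the task loss, whereas this sketch substitutes a fixed SVD projection and plain k-means as stand-ins. The function names (`kmeans`, `low_rank_vq`) and the parameters `d`, `d_low`, and `k` are hypothetical and chosen only for illustration.

```python
import numpy as np


def kmeans(x, k, iters=20, seed=0):
    """Plain Lloyd's k-means on the rows of x; returns (codebook, codes)."""
    rng = np.random.default_rng(seed)
    # Initialise centroids from k distinct data points.
    codebook = x[rng.choice(len(x), size=k, replace=False)].copy()
    for _ in range(iters):
        # Squared distance from every point to every centroid: shape (n, k).
        d2 = ((x[:, None, :] - codebook[None, :, :]) ** 2).sum(-1)
        codes = d2.argmin(1)
        for j in range(k):
            members = x[codes == j]
            if len(members):  # leave empty clusters where they are
                codebook[j] = members.mean(0)
    # Final assignment against the updated centroids.
    d2 = ((x[:, None, :] - codebook[None, :, :]) ** 2).sum(-1)
    return codebook, d2.argmin(1)


def low_rank_vq(weight, d=8, d_low=4, k=16):
    """Illustrative stand-in: project d-dim subvectors to d_low dims,
    cluster there, then map the centroids back to d dims.

    Assumes weight.size is divisible by d and d_low <= d.
    """
    sub = weight.reshape(-1, d)                 # (n, d) subvectors
    # Assumption: a fixed SVD basis replaces the learned low-rank mapping.
    _, _, vt = np.linalg.svd(sub, full_matrices=False)
    basis = vt[:d_low]                          # (d_low, d), orthonormal rows
    low = sub @ basis.T                         # (n, d_low) low-rank representation
    codebook, codes = kmeans(low, k)            # clustering in the reduced space
    recon = codebook[codes] @ basis             # decode back to d dims
    return recon.reshape(weight.shape), codebook, codes, basis
```

Under these assumptions, a layer would store only the integer `codes`, the small `codebook` of shape `(k, d_low)`, and the `basis` of shape `(d_low, d)` in place of the dense weight matrix.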
Taxonomy
Topics: Medical Image Segmentation Techniques · Sparse and Compressive Sensing Techniques · Advanced Data Compression Techniques
