Embedding Compression with Isotropic Iterative Quantization

Siyu Liao; Jie Chen; Yanzhi Wang; Qinru Qiu; Bo Yuan

arXiv:2001.05314·cs.CL·January 24, 2020·1 cites

TL;DR

This paper introduces Isotropic Iterative Quantization (IIQ), a method for compressing word embeddings into binary vectors. It achieves a more than thirty-fold compression ratio with comparable, and sometimes improved, performance relative to the original real-valued embeddings.

Contribution

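The paper adapts iterative quantization, a technique well established in image retrieval, to embedding compression while satisfying the isotropic property desired of PMI-based models. The resulting binary vectors substantially reduce the memory that embeddings require in NLP models.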

Findings

01 Over thirty-fold compression ratio achieved

02 Comparable or improved performance over original embeddings

03 Effective on pre-trained GloVe and HDC embeddings

Abstract

Continuous representation of words is a standard component in deep learning-based NLP models. However, representing a large vocabulary requires significant memory, which can cause problems, particularly on resource-constrained platforms. Therefore, in this paper we propose an isotropic iterative quantization (IIQ) approach for compressing embedding vectors into binary ones, leveraging the iterative quantization technique well established for image retrieval, while satisfying the desired isotropic property of PMI-based models. Experiments with pre-trained embeddings (i.e., GloVe and HDC) demonstrate a more than thirty-fold compression ratio with comparable and sometimes even improved performance over the original real-valued embedding vectors.
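To make the abstract's starting point concrete, the following is a minimal sketch of plain iterative quantization (ITQ) as known from image retrieval, not the paper's IIQ procedure: it omits the isotropy step entirely. The function names `itq_binarize` and `pack_bits`, the iteration count, and the random stand-in data are all hypothetical. The sketch centers the vectors, then alternates between taking signs and solving an orthogonal Procrustes problem for the rotation. The 32x figure it prints assumes float32 input and one bit per original dimension, which may not match the paper's exact setup.

```python
import numpy as np


def itq_binarize(X, n_iter=50, seed=0):
    """Plain ITQ sketch (no isotropy step): find an orthogonal rotation R
    so that sign(V @ R) stays close to V @ R, where V is X centered."""
    rng = np.random.default_rng(seed)
    V = X - X.mean(axis=0, keepdims=True)  # center each dimension
    d = V.shape[1]
    # Random orthogonal initialization via QR decomposition.
    R, _ = np.linalg.qr(rng.standard_normal((d, d)))
    for _ in range(n_iter):
        # Fix R, update the binary codes.
        B = np.where(V @ R >= 0, 1.0, -1.0)
        # Fix B, update R: orthogonal Procrustes, minimizing ||B - V R||_F.
        U, _, Wt = np.linalg.svd(V.T @ B)
        R = U @ Wt
    B = np.where(V @ R >= 0, 1.0, -1.0)
    return B, R


def pack_bits(B):
    """Store each {-1, +1} code as one bit (8 dimensions per byte)."""
    return np.packbits((B > 0).astype(np.uint8), axis=1)


# Stand-in data: random vectors, NOT real GloVe or HDC embeddings.
X = np.random.default_rng(1).standard_normal((200, 64)).astype(np.float32)
B, R = itq_binarize(X)
packed = pack_bits(B)
ratio = X.nbytes / packed.nbytes  # 32 bits per float32 vs. 1 bit per dimension
print(ratio)
```

At equal dimensionality, replacing 32-bit floats with single bits gives exactly 32x, which is consistent with the "more than thirty-fold" claim, although the paper's actual code length is not stated on this page.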

Taxonomy

Topics: Multimodal Machine Learning Applications · Topic Modeling · Advanced Image and Video Retrieval Techniques

Methods: GloVe Embeddings