Improving Word Embedding Factorization for Compression Using Distilled Nonlinear Neural Decomposition
Vasileios Lioutas, Ahmad Rashid, Krtin Kumar, Md Akmal Haidar, and Mehdi Rezagholizadeh

TL;DR
This paper introduces a novel embedding compression method using low-rank matrix decomposition and knowledge distillation, significantly reducing memory usage while improving translation and language modeling performance.
Contribution
It proposes a simple, effective embedding compression technique that outperforms complex state-of-the-art methods in translation and language modeling tasks.
Findings
Higher BLEU scores in translation tasks
Lower perplexity in language modeling
Effective compression with a single parameter
Abstract
Word-embeddings are vital components of Natural Language Processing (NLP) models and have been extensively explored. However, they consume a lot of memory which poses a challenge for edge deployment. Embedding matrices, typically, contain most of the parameters for language models and about a third for machine translation systems. In this paper, we propose Distilled Embedding, an (input/output) embedding compression method based on low-rank matrix decomposition and knowledge distillation. First, we initialize the weights of our decomposed matrices by learning to reconstruct the full pre-trained word-embedding and then fine-tune end-to-end, employing knowledge distillation on the factorized embedding. We conduct extensive experiments with various compression rates on machine translation and language modeling, using different data-sets with a shared word-embedding matrix for both…
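The first step described in the abstract (initializing the decomposed matrices by reconstructing the full pre-trained embedding) can be illustrated with a minimal sketch. It rests on assumptions that are not taken from the paper: it uses a plain truncated SVD as a stand-in for the learned decomposition, a random matrix in place of a real pre-trained embedding, and a mean-squared-error term as one plausible form of a distillation loss on the embedding. The names `low_rank_factorize`, `compression_ratio`, and `embedding_distillation_loss` are hypothetical.

```python
import numpy as np


def low_rank_factorize(E, rank):
    """Approximate E (vocab x dim) as A @ B using a truncated SVD.

    Assumption: a linear SVD stands in for the paper's learned decomposition.
    """
    U, s, Vt = np.linalg.svd(E, full_matrices=False)
    A = U[:, :rank] * s[:rank]   # (vocab x rank), singular values folded in
    B = Vt[:rank, :]             # (rank x dim)
    return A, B


def compression_ratio(vocab, dim, rank):
    """Full embedding parameters divided by factorized parameters."""
    return (vocab * dim) / (rank * (vocab + dim))


def embedding_distillation_loss(E_teacher, A, B):
    """Mean squared error between the full embedding and its reconstruction.

    Assumption: MSE is used here only as an illustrative distillation target.
    """
    return float(np.mean((E_teacher - A @ B) ** 2))


# Toy stand-in for a pre-trained embedding matrix (hypothetical sizes).
rng = np.random.default_rng(0)
E = rng.standard_normal((1000, 64))

A, B = low_rank_factorize(E, rank=16)
rel_err = np.linalg.norm(E - A @ B) / np.linalg.norm(E)

print("factor shapes:", A.shape, B.shape)
print("compression ratio:", compression_ratio(1000, 64, 16))
print("relative reconstruction error:", rel_err)
print("embedding MSE:", embedding_distillation_loss(E, A, B))
```

In this sketch the rank is the only knob: lowering it shrinks the two factors but raises the reconstruction error. In an end-to-end setup, the factors would then be fine-tuned together with the rest of the model rather than left at their initial values.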
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Speech Recognition and Synthesis
Methods: Knowledge Distillation
