Lightweight Adaptation of Neural Language Models via Subspace Embedding
Amit Kumar Jaiswal, Haiming Liu

TL;DR
This paper introduces a compact subspace embedding structure for pre-trained neural language models that substantially reduces the memory footprint at a cost of up to 4% absolute accuracy, making deployment on resource-constrained devices more practical.
Contribution
The authors propose a novel subspace embedding structure that compresses language model embeddings by over 99.8% while remaining competitive on the XNLI and GLUE benchmarks.
Findings
Achieves a compression rate of over 99.8% relative to the original embeddings.
Maintains competitive accuracy on XNLI and GLUE benchmarks.
Effective for multilingual and masked language models.
Abstract
Traditional neural word embeddings usually depend on a rich and diverse vocabulary. Language models, however, tend to cover large vocabularies through their word embedding parameters; this holds in particular for multilingual language models, where the embeddings generally account for a significant part of the overall learned parameters. In this work, we present a new compact embedding structure that reduces the memory footprint of pre-trained language models at a cost of up to 4% absolute accuracy. Embedding vectors are reconstructed from a set of subspace embeddings and an assignment procedure based on the contextual relationship among tokens in pre-trained language models. The subspace embedding structure is calibrated to masked language models, and we evaluate it on similarity and textual entailment tasks as well as sentence and paraphrase tasks. Our experimental evaluation shows…
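The reconstruction idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. The vocabulary size, embedding dimension, number of subspaces, and entries per subspace are made-up values, and all function names are hypothetical. The assignment here simply uses the base-q digits of the token id as a stand-in; the paper's assignment is based on contextual relationships among tokens, which this sketch does not attempt to reproduce. The compression figure counts only embedding-table parameters and ignores the storage for the index tuples.

```python
import numpy as np


def make_assignments(vocab_size, num_subspaces, codes_per_subspace):
    """Give each token a unique tuple of subspace indices.

    Stand-in rule (an assumption, not the paper's procedure): use the
    base-q digits of the token id, where q = codes_per_subspace.
    """
    if codes_per_subspace ** num_subspaces < vocab_size:
        raise ValueError("not enough index combinations for a unique tuple per token")
    ids = np.arange(vocab_size, dtype=np.int64)
    digits = [
        (ids // codes_per_subspace ** k) % codes_per_subspace
        for k in range(num_subspaces)
    ]
    return np.stack(digits, axis=1)  # shape: (vocab_size, num_subspaces)


def reconstruct(token_ids, assignments, tables):
    """Rebuild full embeddings by concatenating one vector per subspace table."""
    idx = assignments[token_ids]  # shape: (n_tokens, num_subspaces)
    parts = [tables[k][idx[:, k]] for k in range(len(tables))]
    return np.concatenate(parts, axis=1)  # shape: (n_tokens, dim)


def compression_rate(vocab_size, dim, codes_per_subspace):
    """Fraction of embedding-table parameters removed."""
    original = vocab_size * dim
    # num_subspaces tables of shape (q, dim / num_subspaces) hold q * dim values in total
    compact = codes_per_subspace * dim
    return 1.0 - compact / original


# Illustrative sizes only; none of these numbers come from the paper.
vocab_size, dim, num_subspaces, codes_per_subspace = 250_000, 768, 8, 64

rng = np.random.default_rng(0)
tables = [
    rng.normal(size=(codes_per_subspace, dim // num_subspaces))
    for _ in range(num_subspaces)
]
assignments = make_assignments(vocab_size, num_subspaces, codes_per_subspace)
vectors = reconstruct(np.array([0, 1, 249_999]), assignments, tables)
rate = compression_rate(vocab_size, dim, codes_per_subspace)
```

With these assumed sizes, the shared tables hold 64 × 768 values instead of 250,000 × 768, so the rate depends only on the ratio of entries per subspace to vocabulary size. The uniqueness check in `make_assignments` matters because two tokens with the same index tuple would receive identical embeddings.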
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Speech Recognition and Synthesis
