On the Compression of Natural Language Models
Saeed Damadi

TL;DR
This paper investigates the existence of sparse, trainable subnetworks within large natural language models, reviewing current compression techniques like quantization, distillation, and pruning to improve interpretability and efficiency.
Contribution
It assesses the applicability of the lottery ticket hypothesis to natural language models, exploring whether sparse subnetworks can be trained to match full model performance.
Findings
Sparse subnetworks can potentially be found in NLMs
Compression techniques improve model efficiency
Interpretability of NLMs can be enhanced
Abstract
Deep neural networks are effective feature extractors, but they are prohibitively large for deployment scenarios. Because of the huge number of parameters, interpreting the parameters in different layers is not straightforward, which is why neural networks are sometimes considered black boxes. Simpler models are easier to explain, but finding them is not easy. If found, a sparse network that can fit the data when trained from scratch would help to interpret the parameters of a neural network. To this end, the lottery ticket hypothesis states that typical dense neural networks contain a small sparse sub-network that can be trained to reach similar test accuracy in an equal number of steps. The goal of this work is to assess whether such a trainable subnetwork exists for natural language models (NLMs). To achieve this goal we will review state-of-the-art compression techniques such as quantization,…
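The abstract states the lottery ticket hypothesis but does not say how a candidate subnetwork is obtained. The following is a minimal sketch, not the paper's method: it assumes the magnitude-pruning recipe commonly associated with the hypothesis, in which the smallest-magnitude trained weights are dropped and the survivors are reset to their initial values before retraining. The function names and the toy weight values are hypothetical, and the training loop itself is omitted.

```python
from typing import List


def magnitude_mask(weights: List[float], sparsity: float) -> List[int]:
    """Return a 0/1 mask that drops the `sparsity` fraction of smallest-magnitude weights."""
    if not 0.0 <= sparsity < 1.0:
        raise ValueError("sparsity must be in [0, 1)")
    n_prune = int(len(weights) * sparsity)
    # Indices sorted by ascending magnitude; the first n_prune are pruned.
    order = sorted(range(len(weights)), key=lambda i: abs(weights[i]))
    mask = [1] * len(weights)
    for i in order[:n_prune]:
        mask[i] = 0
    return mask


def rewind(initial: List[float], mask: List[int]) -> List[float]:
    """Reset surviving weights to their initial values; pruned weights become zero."""
    return [w0 * m for w0, m in zip(initial, mask)]


# Hypothetical toy values: one layer's weights at initialization and after training.
initial_weights = [0.5, -0.1, 0.3, 0.2, -0.4, 0.05]
trained_weights = [0.9, -0.02, 0.01, 0.6, -0.7, 0.03]

# Keep the half of the weights with the largest trained magnitude.
mask = magnitude_mask(trained_weights, sparsity=0.5)

# The candidate "winning ticket": sparse structure from training, values from initialization.
ticket = rewind(initial_weights, mask)
```

Under the hypothesis, retraining `ticket` with `mask` held fixed would reach accuracy similar to the dense network in a comparable number of steps. Whether this holds for natural language models is the question the paper sets out to assess.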
Taxonomy
Topics: Speech Recognition and Synthesis · Topic Modeling · Machine Learning and Data Classification
