Statistical Model Compression for Small-Footprint Natural Language Understanding

Grant P. Strimel; Kanthashree Mysore Sathyendra; Stanislav Peshterliev

arXiv:1807.07520 · cs.CL · July 20, 2018 · 5 cites

Open Access

TL;DR

This paper applies two statistical model compression techniques, parameter quantization and perfect feature hashing, to NLU models, reducing their memory footprint 14-fold with minimal impact on predictive performance.

Contribution

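It presents two compression techniques, parameter quantization and perfect feature hashing, that complement existing pruning strategies such as L1 regularization. The resulting small-footprint NLU models enable offline systems on hardware-restricted devices and reduce on-demand model loading latency in cloud-based systems.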
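
As a rough illustration of why parameter quantization complements pruning: pruning removes parameters, while quantization shrinks the storage needed for each parameter that remains. The sketch below assumes uniform 8-bit linear quantization in plain Python. This page does not state which quantization scheme the paper uses, and the names `quantize` and `dequantize` are hypothetical.

```python
from array import array

LEVELS = 255  # largest code that fits in one unsigned byte


def quantize(weights):
    """Map float weights to 8-bit codes (assumed scheme: uniform linear)."""
    lo, hi = min(weights), max(weights)
    # Width of one quantization step; fall back to 1.0 if all weights are equal.
    scale = (hi - lo) / LEVELS if hi > lo else 1.0
    codes = array("B", (min(LEVELS, round((w - lo) / scale)) for w in weights))
    return codes, lo, scale


def dequantize(codes, lo, scale):
    """Reconstruct approximate float weights from the 8-bit codes."""
    return [lo + c * scale for c in codes]


# Hypothetical weights, for illustration only.
weights = [-1.0, -0.5, 0.0, 0.25, 1.0]
codes, lo, scale = quantize(weights)
restored = dequantize(codes, lo, scale)
max_error = max(abs(w - r) for w, r in zip(weights, restored))
```

Each weight here takes one byte instead of the four bytes of a 32-bit float, at the cost of a reconstruction error of at most half a quantization step. That factor alone does not account for the 14-fold figure, which is reported for the combined approach.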

Findings

01 Achieves a 14-fold reduction in memory usage compared to the original models

02 Has minimal impact on predictive performance

03 Evaluated on a large-scale NLU system

Abstract

In this paper, we investigate statistical model compression applied to natural language understanding (NLU) models. Small-footprint NLU models are important for enabling offline systems on hardware-restricted devices and for decreasing on-demand model loading latency in cloud-based systems. To compress NLU models, we present two main techniques: parameter quantization and perfect feature hashing. These techniques are complementary to existing model pruning strategies such as L1 regularization. We performed experiments on a large-scale NLU system. The results show that our approach achieves a 14-fold reduction in memory usage compared to the original models, with minimal impact on predictive performance.
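
To make the idea of perfect feature hashing more concrete, the following is a minimal sketch, assuming a generic hash-and-displace construction over a fixed feature vocabulary. The abstract does not describe the paper's actual construction, and `build_perfect_hash`, `lookup`, and the example feature strings are hypothetical. A perfect hash maps every known feature to a distinct slot, so weights can sit in a dense array without the feature strings themselves being stored.

```python
import hashlib


def _hash(key, seed, size):
    """Seeded hash of a string into the range [0, size)."""
    data = f"{seed}:{key}".encode("utf-8")
    return int.from_bytes(hashlib.sha256(data).digest()[:8], "little") % size


def build_perfect_hash(keys):
    """Find one seed per first-level bucket so that all keys get distinct slots."""
    keys = list(keys)
    n = len(keys)
    if n == 0 or len(set(keys)) != n:
        raise ValueError("keys must be non-empty and unique")
    buckets = [[] for _ in range(n)]
    for key in keys:
        buckets[_hash(key, 0, n)].append(key)
    seeds = [0] * n
    occupied = [False] * n
    # Place the largest buckets first; they are the hardest to fit.
    for b in sorted(range(n), key=lambda i: len(buckets[i]), reverse=True):
        bucket = buckets[b]
        if not bucket:
            break  # buckets are sorted by size, so the rest are empty too
        seed = 1
        while True:  # brute-force search for a seed that causes no collision
            slots = [_hash(key, seed, n) for key in bucket]
            if len(set(slots)) == len(slots) and not any(occupied[s] for s in slots):
                break
            seed += 1
        seeds[b] = seed
        for s in slots:
            occupied[s] = True
    return seeds


def lookup(key, seeds):
    """Return the slot for a key; only meaningful for keys used at build time."""
    n = len(seeds)
    return _hash(key, seeds[_hash(key, 0, n)], n)


# Hypothetical feature vocabulary and weights, for illustration only.
features = ["word=play", "word=music", "bigram=play_music", "word=stop", "prev=<s>"]
seeds = build_perfect_hash(features)
weights = [0.0] * len(features)
for feature, weight in zip(features, [1.5, -0.25, 2.0, 0.75, -1.0]):
    weights[lookup(feature, seeds)] = weight
```

At inference time only `seeds` and `weights` need to be kept. One caveat of this simplified version is that a feature never seen at build time still maps to some slot, so a practical system would need a way to reject unknown features.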

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Algorithms and Data Compression