# ShrinkML: End-to-End ASR Model Compression Using Reinforcement Learning

**Authors:** Łukasz Dudziak, Mohamed S. Abdelfattah, Ravichander Vipperla, Stefanos Laskaridis, Nicholas D. Lane

arXiv: 1907.03540 · 2019-09-25

## TL;DR

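This paper introduces ShrinkML, an AutoML system that uses reinforcement learning (RL) to choose per-layer SVD compression ratios for an attention-based end-to-end ASR model. Without retraining, the RL search is an effective and practical compression method; with retraining, it selects better highly compressed seed models than manual rank selection.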

## Contribution

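The paper presents an RL-based AutoML approach to choosing per-layer SVD compression ratios for an end-to-end ASR model built from several LSTM layers. It focuses on practical design choices: the reward/punishment functions, the formation of an effective search space, and a small but representative data set for quick evaluation between search steps.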

## Key findings

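- Without retraining, the RL-based search is an effective and practical way to compress a production-grade ASR system.
- When retraining is possible, the AutoML system selects better highly compressed seed models than hand-crafted rank selection, allowing more compression than previously possible.
- Accuracy is reported on LibriSpeech and compared against manually compressed models.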

## Abstract

End-to-end automatic speech recognition (ASR) models are increasingly large and complex to achieve the best possible accuracy. In this paper, we build an AutoML system that uses reinforcement learning (RL) to optimize the per-layer compression ratios when applied to a state-of-the-art attention based end-to-end ASR model composed of several LSTM layers. We use singular value decomposition (SVD) low-rank matrix factorization as the compression method. For our RL-based AutoML system, we focus on practical considerations such as the choice of the reward/punishment functions, the formation of an effective search space, and the creation of a representative but small data set for quick evaluation between search steps. Finally, we present accuracy results on LibriSpeech of the model compressed by our AutoML system, and we compare it to manually-compressed models. Our results show that in the absence of retraining our RL-based search is an effective and practical method to compress a production-grade ASR system. When retraining is possible, we show that our AutoML system can select better highly-compressed seed models compared to manually hand-crafted rank selection, thus allowing for more compression than previously possible.
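
The SVD low-rank factorization named in the abstract can be illustrated with a minimal NumPy sketch. This is not the paper's implementation: the function names `svd_compress` and `compression_ratio`, the matrix shape, and the rank of 32 are all assumptions chosen for illustration. The sketch shows how a single weight matrix is replaced by two thinner factors, and how the chosen rank determines the parameter saving, which is the per-layer quantity a rank search would tune.

```python
import numpy as np


def svd_compress(weight, rank):
    """Factor an (m x n) matrix into (m x rank) and (rank x n) factors.

    Hypothetical helper for illustration; not taken from the paper.
    """
    u, s, vt = np.linalg.svd(weight, full_matrices=False)
    # Keep the `rank` largest singular values and fold them into the left factor.
    a = u[:, :rank] * s[:rank]  # shape (m, rank)
    b = vt[:rank, :]            # shape (rank, n)
    return a, b


def compression_ratio(shape, rank):
    """Original parameter count divided by the factored parameter count."""
    m, n = shape
    return (m * n) / (rank * (m + n))


# Assumed example: a random 256 x 512 matrix standing in for one layer's weights.
rng = np.random.default_rng(0)
w = rng.standard_normal((256, 512))

a, b = svd_compress(w, rank=32)
rel_error = np.linalg.norm(w - a @ b) / np.linalg.norm(w)
ratio = compression_ratio(w.shape, 32)

print(f"factor shapes: {a.shape}, {b.shape}")
print(f"compression ratio: {ratio:.2f}")
print(f"relative reconstruction error: {rel_error:.3f}")
```

A random matrix has a slowly decaying spectrum, so the reconstruction error here is large; trained weight matrices may behave differently. A lower rank gives more compression at the cost of a larger approximation error, and that trade-off differs from layer to layer, which is why the rank is chosen per layer.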

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/1907.03540/full.md

## Figures

9 figures with captions in the complete paper: https://tomesphere.com/paper/1907.03540/full.md

## References

27 references — full list in the complete paper: https://tomesphere.com/paper/1907.03540/full.md

---
Source: https://tomesphere.com/paper/1907.03540