Iterative Compression of End-to-End ASR Model using AutoML

Abhinav Mehrotra; Łukasz Dudziak; Jinsu Yeo; Young-yoon Lee; Ravichander Vipperla; Mohamed S. Abdelfattah; Sourav Bhattacharya; Samin Ishtiaq; Alberto Gil C. P. Ramos; SangJeong Lee; Daehyun Kim; Nicholas D. Lane

arXiv:2008.02897·cs.LG·August 11, 2020

TL;DR

This paper introduces an iterative AutoML-based low-rank factorization (LRF) method for end-to-end ASR models. It achieves over 5x compression without degrading the word error rate (WER), surpassing previous AutoML-based techniques.

Contribution

It presents a novel iterative AutoML approach that extends the compression capabilities of existing AutoML-based LRF methods for ASR models.

Findings

01

Achieves over 5x model compression without WER degradation.

02

Outperforms previous AutoML-based compression methods.

03

Maintains acceptable WER at higher compression levels.

Abstract

Increasing demand for on-device Automatic Speech Recognition (ASR) systems has resulted in renewed interest in developing automatic model compression techniques. Past research has shown that an AutoML-based Low Rank Factorization (LRF) technique, when applied to an end-to-end Encoder-Attention-Decoder style ASR model, can achieve a speedup of up to 3.7x, outperforming laborious manual rank-selection approaches. However, we show that current AutoML-based search techniques only work up to a certain compression level, beyond which they fail to produce compressed models with acceptable word error rates (WER). In this work, we propose an iterative AutoML-based LRF approach that achieves over 5x compression without degrading the WER, thereby advancing the state of the art in ASR compression.
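
For readers unfamiliar with low rank factorization, the sketch below shows the basic idea on a single weight matrix: an m x n matrix is replaced by the product of an m x r and an r x n matrix, so the parameter count drops from m*n to r*(m + n). This is a generic truncated-SVD illustration, not the paper's method. The matrix size, the rank of 48, and the helper names `low_rank_factorize` and `compression_ratio` are assumptions chosen for the example, and the AutoML search that picks a rank per layer is not shown.

```python
import numpy as np


def low_rank_factorize(weight, rank):
    """Approximate `weight` (m x n) by a @ b, with a: m x rank and b: rank x n.

    Uses a truncated SVD, which gives the best rank-`rank` approximation
    in the Frobenius norm. (Illustrative helper, not from the paper.)
    """
    u, s, vt = np.linalg.svd(weight, full_matrices=False)
    a = u[:, :rank] * s[:rank]  # fold the singular values into the left factor
    b = vt[:rank, :]
    return a, b


def compression_ratio(shape, rank):
    """Ratio of original parameters (m*n) to factorized parameters (rank*(m+n))."""
    m, n = shape
    return (m * n) / (rank * (m + n))


# Hypothetical 512 x 512 layer; random weights stand in for a trained matrix.
rng = np.random.default_rng(0)
weight = rng.standard_normal((512, 512))

rank = 48  # assumed value; in an AutoML setting this is what the search selects
a, b = low_rank_factorize(weight, rank)

ratio = compression_ratio(weight.shape, rank)
rel_error = np.linalg.norm(weight - a @ b) / np.linalg.norm(weight)

print(f"compression: {ratio:.2f}x, relative reconstruction error: {rel_error:.3f}")
```

With these assumed sizes, rank 48 corresponds to a per-layer compression of 16/3, a little over 5x. A random matrix has no low-rank structure, so its reconstruction error is large; trained weight matrices are typically more compressible, and the rank chosen for each layer determines the trade-off between size and WER.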
