Sparsification via Compressed Sensing for Automatic Speech Recognition

Kai Zhen (1, 2), Hieu Duy Nguyen (2), Feng-Ju Chang (2), Athanasios Mouchtaris (2), and Ariya Rastrow (2). ((1) Indiana University Bloomington, (2) Alexa Machine Learning, Amazon, USA)

arXiv:2102.04932·cs.LG·February 10, 2021

TL;DR

This paper introduces a compressed sensing based pruning (CSP) method for automatic speech recognition (ASR) models. The method reduces model size and latency while maintaining accuracy, and it outperforms existing pruning techniques.

Contribution

The paper proposes compressed sensing based pruning (CSP), an approach that integrates compressed sensing into model training to improve sparse pruning of ASR models.

Findings

01

CSP outperforms existing pruning methods in ASR tasks.

02

The approach reduces model size and latency effectively.

03

It maintains high accuracy despite aggressive sparsification.

Abstract

To achieve high accuracy in machine learning (ML) applications, it is essential to employ models with a large number of parameters. Certain applications, such as Automatic Speech Recognition (ASR), however, require real-time interaction with users, which compels the model to have as low a latency as possible. Deploying large-scale ML applications thus necessitates model quantization and compression, especially when running ML models on resource-constrained devices. For example, by forcing some of the model weight values to zero, it is possible to apply zero-weight compression, which reduces both the model size and the time needed to read the model from memory. In the literature, such methods are referred to as sparse pruning. The fundamental questions are when and which weights should be forced to zero, i.e., pruned. In this work, we propose a compressed sensing based pruning (CSP)…
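To make the idea of forcing weights to zero and then applying zero-weight compression concrete, here is a minimal Python sketch. It uses plain magnitude-based pruning, which is a generic baseline and not the CSP method the paper proposes. The function names (`magnitude_prune`, `compress_nonzeros`, `decompress`), the 90% sparsity level, and the index-plus-value storage layout are all assumptions chosen for illustration.

```python
import numpy as np

def magnitude_prune(weights, sparsity):
    """Zero out (at least) the `sparsity` fraction of smallest-magnitude weights.

    Generic magnitude pruning, used here only as an illustration;
    it is NOT the CSP method described in the paper.
    """
    flat = np.abs(weights).ravel()
    k = int(sparsity * flat.size)  # number of weights to force to zero
    if k == 0:
        return weights.copy()
    # The k-th smallest magnitude serves as the pruning threshold.
    threshold = np.partition(flat, k - 1)[k - 1]
    mask = np.abs(weights) > threshold
    return np.where(mask, weights, 0.0).astype(weights.dtype)

def compress_nonzeros(weights):
    """Hypothetical zero-weight compression: keep nonzero values and their flat indices."""
    flat = weights.ravel()
    idx = np.flatnonzero(flat)
    return idx.astype(np.int32), flat[idx], weights.shape

def decompress(idx, values, shape):
    """Rebuild the dense matrix from the stored indices and values."""
    flat = np.zeros(int(np.prod(shape)), dtype=values.dtype)
    flat[idx] = values
    return flat.reshape(shape)

rng = np.random.default_rng(0)
w = rng.standard_normal((64, 64)).astype(np.float32)  # stand-in weight matrix

pruned = magnitude_prune(w, sparsity=0.9)
idx, vals, shape = compress_nonzeros(pruned)
restored = decompress(idx, vals, shape)

dense_bytes = pruned.nbytes
sparse_bytes = idx.nbytes + vals.nbytes
print(f"nonzero weights: {vals.size} of {w.size}")
print(f"dense: {dense_bytes} bytes, compressed: {sparse_bytes} bytes")
```

The sketch answers the "which weights" question with the simplest possible rule (smallest magnitude) and ignores the "when" question entirely by pruning once after the fact. Per the abstract, those are the two questions the paper's approach addresses during training.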

Taxonomy

Methods: Pruning