Learning a Neural Diff for Speech Models

Jonathan Macoskey; Grant P. Strimel; Ariya Rastrow

arXiv:2108.01561 · eess.AS · August 18, 2021

TL;DR

This paper introduces neural update methods that let successive generations of speech models be sent from server to device within an over-the-network data budget, and shows that they outperform comparable model compression baselines.

Contribution

The paper proposes two architecture-agnostic neural update techniques that learn compact representations of model updates for transmission to devices, addressing the over-the-network data budget constraint in on-device speech processing.

Findings

01 Budgeted updates outperform compression baselines in ASR and SLU tasks.

02 Methods effectively transmit successive speech model updates within data budgets.

03 Experimental results validate the efficiency of the proposed approaches.

Abstract

As more speech processing applications execute locally on edge devices, a set of resource constraints must be considered. In this work we address one of these constraints, namely over-the-network data budgets for transferring models from server to device. We present neural update approaches for releasing subsequent speech model generations while abiding by a data budget. We detail two architecture-agnostic methods which learn compact representations for transmission to devices. We experimentally validate our techniques on two tasks (automatic speech recognition and spoken language understanding) using open-source data sets, demonstrating that, when applied in succession, our budgeted updates outperform comparable model compression baselines by significant margins.
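To make the idea of a budgeted update concrete, here is a minimal sketch of sending a model "diff" under a byte budget. It is not one of the paper's two methods, which learn their compact representations and are not detailed in the abstract. It is a simple, non-learned stand-in that keeps only the largest weight changes that fit. The function names, the flat weight list, and the 8-bytes-per-entry cost are all assumptions made for illustration.

```python
from typing import Dict, List

# Assumption: each transmitted change costs a 4-byte index plus a 4-byte value.
BYTES_PER_ENTRY = 8


def make_budgeted_diff(old: List[float], new: List[float],
                       budget_bytes: int) -> Dict[int, float]:
    """Return the largest-magnitude weight changes that fit in the budget."""
    if len(old) != len(new):
        raise ValueError("old and new must have the same number of weights")
    max_entries = budget_bytes // BYTES_PER_ENTRY
    # Collect (index, delta) pairs for every weight that changed.
    deltas = [(i, n - o) for i, (o, n) in enumerate(zip(old, new)) if n != o]
    # Largest absolute change first, then truncate to the budget.
    deltas.sort(key=lambda item: abs(item[1]), reverse=True)
    return dict(deltas[:max_entries])


def apply_diff(old: List[float], diff: Dict[int, float]) -> List[float]:
    """Apply a sparse diff on the device, leaving untouched weights as-is."""
    updated = list(old)
    for index, delta in diff.items():
        updated[index] += delta
    return updated


# Hypothetical toy weights: the device holds `on_device`, the server has `target`.
on_device = [0.0, 1.0, 2.0, 3.0]
target = [0.5, 1.0, 0.0, 3.25]
diff = make_budgeted_diff(on_device, target, budget_bytes=16)  # room for 2 entries
on_device = apply_diff(on_device, diff)
```

With a 16-byte budget only two of the three changed weights are sent, so the device ends up close to, but not exactly at, the server's weights. For successive releases, the next diff would be computed against the weights the device actually holds after the previous update.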
