Learning a Neural Diff for Speech Models
Jonathan Macoskey, Grant P. Strimel, Ariya Rastrow

TL;DR
This paper introduces neural update methods for speech models that enable efficient over-the-network transmission, allowing successive model updates within data constraints and outperforming traditional compression techniques.
Contribution
It proposes two architecture-agnostic neural update techniques for compact model transmission, addressing resource constraints in edge speech processing.
Findings
Budgeted updates outperform compression baselines in ASR and SLU tasks.
Methods effectively transmit successive speech model updates within data budgets.
Experiments on open-source data sets for both tasks validate the proposed approaches.
Abstract
As more speech processing applications execute locally on edge devices, a set of resource constraints must be considered. In this work we address one of these constraints, namely over-the-network data budgets for transferring models from server to device. We present neural update approaches for releasing subsequent speech model generations while abiding by a data budget. We detail two architecture-agnostic methods which learn compact representations for transmission to devices. We experimentally validate our techniques with results on two tasks (automatic speech recognition and spoken language understanding) on open-source data sets, demonstrating that, when applied in succession, our budgeted updates outperform comparable model compression baselines by significant margins.
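The abstract does not spell out the two methods, so the sketch below is not the paper's technique. It is a simple, non-learned stand-in (a top-k sparse weight delta) meant only to illustrate what a budgeted update could look like: the server sends the changes that fit in a byte budget, and the device applies them to the model it already holds. The function names, the flat list-of-floats weight layout, and the 8-bytes-per-entry cost are all assumptions made for illustration.

```python
from typing import Dict, List

# Assumption: each transmitted entry costs a 4-byte index plus a 4-byte value.
BYTES_PER_ENTRY = 8


def make_budgeted_diff(old: List[float], new: List[float],
                       budget_bytes: int) -> Dict[int, float]:
    """Server side: keep the largest-magnitude weight changes that fit the budget."""
    if len(old) != len(new):
        raise ValueError("old and new weights must have the same length")
    max_entries = budget_bytes // BYTES_PER_ENTRY
    # Collect (index, delta) pairs for every weight that actually changed.
    deltas = [(i, n - o) for i, (o, n) in enumerate(zip(old, new)) if n != o]
    # Largest absolute change first; anything beyond the budget is dropped.
    deltas.sort(key=lambda item: abs(item[1]), reverse=True)
    return dict(deltas[:max_entries])


def apply_diff(old: List[float], diff: Dict[int, float]) -> List[float]:
    """Device side: add the received deltas to the weights already on the device."""
    updated = list(old)
    for index, delta in diff.items():
        updated[index] += delta
    return updated


# Hypothetical toy weights: the device holds `device`, the server has trained `target`.
device = [0.0, 1.0, 2.0, 3.0]
target = [0.5, 1.0, 0.0, 3.25]

diff = make_budgeted_diff(device, target, budget_bytes=16)  # room for 2 entries
device = apply_diff(device, diff)
```

For successive releases, the next diff in this sketch would be computed against the weights the device actually holds after the previous update, not against the server's ideal previous model, since dropped entries mean the two can differ.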
