OnDA: On-device Channel Pruning for Efficient Personalized Keyword Spotting
Matteo Risso, Alessio Burrello, Daniele Jahier Pagliari

TL;DR
This paper introduces OnDA, a novel on-device channel pruning method that combines weight and architectural adaptation for personalized keyword spotting, significantly reducing model size and resource consumption while maintaining accuracy.
Contribution
It is the first to couple online structured channel pruning with weight adaptation (on-device training) for personalized on-device KWS, enabling model compression and resource savings during both adaptation and inference.
Findings
Achieves up to 9.63x model-size compression over unpruned baselines at iso-task performance (accuracy at 0.5 false alarms per hour).
Reduces latency and energy consumption by over 1.5x during online training and inference.
Demonstrates effectiveness on the HeySnips and HeySnapdragon datasets, with the adaptation pipeline deployed on a Jetson Orin Nano embedded GPU.
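The iso-task comparison in the findings above relies on an operating point fixed by a false-alarm budget (0.5 false alarms per hour, per the abstract). The sketch below shows one common way such a metric can be computed. It is an illustration under assumptions, not the authors' evaluation code: the function name `accuracy_at_fa_rate`, the reading of "accuracy" as the fraction of keyword utterances detected, and the toy scores are all hypothetical.

```python
import numpy as np

def accuracy_at_fa_rate(pos_scores, neg_scores, neg_hours, max_fa_per_hour=0.5):
    """Fraction of keyword utterances detected at a false-alarm budget.

    Assumption: a detection fires when score > threshold, and "accuracy"
    means the detection rate on positive (keyword) utterances.
    """
    pos = np.asarray(pos_scores, dtype=float)
    # Negative (non-keyword) scores sorted from highest to lowest.
    neg = np.sort(np.asarray(neg_scores, dtype=float))[::-1]
    # Number of false alarms tolerated over the whole negative set.
    allowed = int(np.floor(max_fa_per_hour * neg_hours))
    if allowed >= len(neg):
        threshold = -np.inf  # budget is never exceeded
    else:
        # Only scores strictly above this value fire, so at most
        # `allowed` negatives trigger a false alarm.
        threshold = neg[allowed]
    return float(np.mean(pos > threshold)), float(threshold)

# Toy data (hypothetical): 4 keyword clips, 4 negative clips spanning 2 hours.
acc, thr = accuracy_at_fa_rate(
    pos_scores=[0.9, 0.8, 0.4, 0.95],
    neg_scores=[0.1, 0.85, 0.3, 0.5],
    neg_hours=2.0,
)
print(acc, thr)
```

With a 2-hour negative set, the budget allows one false alarm, so the threshold sits at the second-highest negative score. Comparing pruned and unpruned models at the same budget is what makes the compression figure an iso-task comparison.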
Abstract
Always-on keyword spotting (KWS) demands on-device adaptation to cope with user- and environment-specific distribution shifts under tight latency and energy budgets. This paper proposes, for the first time, coupling weight adaptation (i.e., on-device training) with architectural adaptation, in the form of online structured channel pruning, for personalized on-device KWS. Starting from a state-of-the-art self-learning personalized KWS pipeline, we compare data-agnostic and data-aware pruning criteria applied on in-field pseudo-labelled user data. On the HeySnips and HeySnapdragon datasets, we achieve up to 9.63x model-size compression with respect to unpruned baselines at iso-task performance, measured as the accuracy at 0.5 false alarms per hour. When deploying our adaptation pipeline on a Jetson Orin Nano embedded GPU, we achieve up to 1.52x/1.57x and 1.64x/1.77x latency and…
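To make the distinction between data-agnostic and data-aware pruning criteria concrete, here is a minimal NumPy sketch of structured channel pruning on a pair of consecutive 1D convolution layers. It is not the OnDA implementation: the specific criteria (per-channel L1 weight norm as the data-agnostic score, mean absolute activation as the data-aware score), the function names, and the tensor shapes are assumptions chosen for illustration.

```python
import numpy as np

def scores_l1(weight):
    """Data-agnostic score: L1 norm of each output channel's filter.

    weight has shape (out_channels, in_channels, kernel_size).
    """
    return np.abs(weight).reshape(weight.shape[0], -1).sum(axis=1)

def scores_activation(activations):
    """Data-aware score: mean absolute activation per channel.

    activations has shape (batch, out_channels, time) and would come
    from running (pseudo-labelled) user data through the layer.
    """
    return np.abs(activations).mean(axis=(0, 2))

def prune_channels(weight, next_weight, scores, keep_ratio):
    """Keep the highest-scoring output channels of `weight` and drop the
    matching input channels of the following layer `next_weight`."""
    n_keep = max(1, int(round(keep_ratio * weight.shape[0])))
    keep = np.sort(np.argsort(scores)[-n_keep:])
    return weight[keep], next_weight[:, keep], keep

rng = np.random.default_rng(0)
w1 = rng.normal(size=(8, 4, 3))     # layer 1: 8 output channels
w2 = rng.normal(size=(16, 8, 3))    # layer 2 consumes those 8 channels
acts = rng.normal(size=(32, 8, 50)) # stand-in for layer-1 activations

p1, p2, kept = prune_channels(w1, w2, scores_l1(w1), keep_ratio=0.5)
q1, q2, kept_da = prune_channels(w1, w2, scores_activation(acts), keep_ratio=0.5)
print(p1.shape, p2.shape, kept, kept_da)
```

Because whole channels are removed, the pruned layers are simply smaller dense tensors, which is why structured pruning can translate into latency and energy savings on ordinary hardware without sparse kernels.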
Taxonomy
Topics: EEG and Brain-Computer Interfaces · Advanced Text Analysis Techniques · Topic Modeling
