Progressive Continual Learning for Spoken Keyword Spotting

Yizheng Huang; Nana Hou; Nancy F. Chen

arXiv:2201.12546·cs.CL·February 8, 2022

TL;DR

This paper introduces PCL-KWS, a progressive continual learning framework for spoken keyword spotting. It learns new keywords sequentially without forgetting previously learned ones, and keeps accuracy high while constraining model growth.

Contribution

The paper proposes a novel progressive continual learning strategy with task-specific sub-networks and keyword-aware scaling, enabling incremental learning in KWS without catastrophic forgetting.
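
The idea of task-specific sub-networks with keyword-aware scaling can be illustrated with a minimal sketch. It is not the authors' implementation: the class names, the linear width rule in scaled_width, and the constants base_width and per_keyword are all hypothetical, chosen only to show how freezing earlier sub-networks protects old keywords while the size of each new sub-network depends on how many keywords the task adds.

```python
import random
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class SubNetwork:
    """One task-specific classifier head (hypothetical, a single weight matrix)."""
    keywords: List[str]
    hidden: int
    weights: List[List[float]]  # shape: len(keywords) x hidden
    frozen: bool = False


def scaled_width(num_keywords: int, base_width: int = 8, per_keyword: int = 2) -> int:
    # Assumed scaling rule: width grows linearly with the number of keywords.
    # The paper's actual keyword-aware scaling rule is not given in this text.
    return base_width + per_keyword * num_keywords


class ProgressiveKWS:
    """Toy stand-in for a network instantiator that adds one sub-network per task."""

    def __init__(self, seed: int = 0) -> None:
        self.rng = random.Random(seed)
        self.subnets: Dict[int, SubNetwork] = {}

    def instantiate(self, task_id: int, keywords: List[str]) -> SubNetwork:
        # Freeze every earlier sub-network so later training cannot overwrite it.
        for net in self.subnets.values():
            net.frozen = True
        hidden = scaled_width(len(keywords))
        weights = [[self.rng.uniform(-0.1, 0.1) for _ in range(hidden)]
                   for _ in keywords]
        net = SubNetwork(list(keywords), hidden, weights)
        self.subnets[task_id] = net
        return net

    def update(self, task_id: int, delta: float) -> None:
        # Placeholder for a training step: only an unfrozen sub-network may change.
        net = self.subnets[task_id]
        if net.frozen:
            raise RuntimeError(f"sub-network {task_id} is frozen")
        net.weights = [[w + delta for w in row] for row in net.weights]

    def num_parameters(self) -> int:
        return sum(len(n.keywords) * n.hidden for n in self.subnets.values())


model = ProgressiveKWS(seed=0)
model.instantiate(0, ["yes", "no", "up"])
model.instantiate(1, ["left", "right"])
model.update(1, 0.5)            # allowed: task 1 is the newest sub-network
print(model.num_parameters())   # 3 * 14 + 2 * 12 = 66
```

In this sketch the total parameter count grows only by the size of each new sub-network, so a task that adds two keywords costs less than one that adds three.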

Findings

01. Achieves 92.8% average accuracy across all tasks on the Google Speech Command dataset after learning five new tasks sequentially.

02. Outperforms existing baselines in continual learning for KWS.

03. Maintains high performance with constrained model growth.

Abstract

Catastrophic forgetting is a thorny challenge when updating keyword spotting (KWS) models after deployment. To tackle this challenge, we propose a progressive continual learning strategy for small-footprint spoken keyword spotting (PCL-KWS). Specifically, the proposed PCL-KWS framework introduces a network instantiator that generates task-specific sub-networks for remembering previously learned keywords. As a result, the PCL-KWS approach incrementally learns new keywords without forgetting prior knowledge. In addition, the keyword-aware network scaling mechanism of PCL-KWS constrains the growth of model parameters while achieving high performance. Experimental results show that after learning five new tasks sequentially, our proposed PCL-KWS approach achieves new state-of-the-art performance of 92.8% average accuracy for all the tasks on the Google Speech Command dataset compared with…

Taxonomy

Topics: Speech Recognition and Synthesis · Music and Audio Processing · Speech and Audio Processing