Teaching keyword spotters to spot new keywords with limited examples

Abhijeet Awasthi; Kevin Kilgour; Hassan Rom

arXiv:2106.02443·eess.AS·June 7, 2021

TL;DR

This paper introduces KeySEM, a speech embedding model pre-trained on keyword recognition. Its representations allow new keywords to be learned quickly and effectively from limited examples, which makes it suitable for personalized, on-device keyword spotting.

Contribution

We propose KeySEM, a pre-trained speech embedding model that improves few-shot keyword learning, generalizes across languages, and supports learning new keywords without re-training on previous ones.

Findings

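01. Across several datasets, KeySEM consistently outperforms a diverse range of related methods while using fewer training examples.

02. Although pre-trained only on English utterances, it generalizes to other languages.

03. It allows new keywords to be learned sequentially, without re-training on previously learned keywords.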
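
One way to picture few-shot, sequential keyword learning on top of a frozen embedding model is a nearest-centroid classifier. The sketch below is an illustration under stated assumptions, not the method used in the paper: `toy_embed` is a hypothetical stand-in for a pre-trained model such as KeySEM (it only L2-normalizes its input), and `KeywordBank` is a name introduced here for the example. Each keyword is stored as the mean embedding of its few examples, so adding a keyword never modifies the ones already registered.

```python
import numpy as np


def toy_embed(audio):
    """Hypothetical stand-in for a frozen, pre-trained embedding model.

    It only L2-normalizes the input vector. A real model would map raw
    audio to a learned representation; that part is assumed here.
    """
    audio = np.asarray(audio, dtype=float)
    return audio / np.linalg.norm(audio)


class KeywordBank:
    """Nearest-centroid keyword classifier over frozen embeddings."""

    def __init__(self, embed_fn=toy_embed):
        self.embed_fn = embed_fn
        self.centroids = {}  # keyword -> unit-norm mean embedding

    def add_keyword(self, keyword, examples):
        """Register a keyword from a handful of example utterances.

        Only the new keyword's centroid is written; existing centroids
        are left untouched, so no re-training on earlier keywords occurs.
        """
        mean = np.mean([self.embed_fn(x) for x in examples], axis=0)
        self.centroids[keyword] = mean / np.linalg.norm(mean)

    def predict(self, audio):
        """Return the keyword whose centroid is most cosine-similar."""
        query = self.embed_fn(audio)
        return max(self.centroids,
                   key=lambda k: float(query @ self.centroids[k]))
```

In this sketch the embedding function is never updated, so the cost of learning a keyword is one forward pass per example plus an average. A small trained classification head per keyword would be another reasonable choice; the centroid version is used here only because it makes the independence between keywords easy to see.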

Abstract

Learning to recognize new keywords with just a few examples is essential for personalizing keyword spotting (KWS) models to a user's choice of keywords. However, modern KWS models are typically trained on large datasets and restricted to a small vocabulary of keywords, limiting their transferability to a broad range of unseen keywords. Towards easily customizable KWS models, we present KeySEM (Keyword Speech EMbedding), a speech embedding model pre-trained on the task of recognizing a large number of keywords. Speech representations offered by KeySEM are highly effective for learning new keywords from a limited number of examples. Comparisons with a diverse range of related work across several datasets show that our method achieves consistently superior performance with fewer training examples. Although KeySEM was pre-trained only on English utterances, the performance gains also extend…
