Metric Learning for User-defined Keyword Spotting
Jaemin Jung, Youkyum Kim, Jihwan Park, Youshin Lim, Byeong-Yeol Kim, Youngjoon Jang, Joon Son Chung

TL;DR
This paper introduces a metric learning approach for user-defined keyword spotting that enhances the detection of new spoken terms without incremental training, improving performance and providing a standardized evaluation protocol.
Contribution
It presents a novel metric learning-based training strategy, a large-scale keyword dataset with filtering, and a unified evaluation protocol for user-defined keyword spotting.
Findings
Improved detection accuracy on Google Speech Commands dataset
Effective representation enrichment for unseen keywords
Outperforms previous methods significantly
Abstract
The goal of this work is to detect new spoken terms defined by users. While most previous works address Keyword Spotting (KWS) as a closed-set classification problem, this limits their transferability to unseen terms. The ability to define custom keywords has advantages in terms of user experience. In this paper, we propose a metric learning-based training strategy for user-defined keyword spotting. In particular, we make the following contributions: (1) we construct a large-scale keyword dataset with an existing speech corpus and propose a filtering method to remove data that degrade model training; (2) we propose a metric learning-based two-stage training strategy, and demonstrate that the proposed method improves the performance on the user-defined keyword spotting task by enriching their representations; (3) to facilitate the fair comparison in the user-defined KWS field, we…
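As a rough illustration of why a metric-learned embedding space lets users add keywords without retraining, the sketch below enrolls each keyword as the average of a few example embeddings and accepts a query when its cosine similarity to the nearest keyword clears a threshold. It is not the authors' implementation: the function names (`enroll`, `detect`), the three-dimensional toy vectors, and the 0.7 threshold are all assumptions made for the example, and a real system would obtain the embeddings from a trained audio encoder.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat a zero vector as matching nothing
    return dot / (norm_a * norm_b)


def enroll(embeddings):
    """Average a few example embeddings into one keyword prototype."""
    dim = len(embeddings[0])
    return [sum(e[i] for e in embeddings) / len(embeddings) for i in range(dim)]


def detect(query, prototypes, threshold=0.7):
    """Return (keyword, score) for the closest prototype, or (None, score)
    when even the best score falls below the (hypothetical) threshold."""
    best_keyword, best_score = None, -1.0
    for keyword, prototype in prototypes.items():
        score = cosine_similarity(query, prototype)
        if score > best_score:
            best_keyword, best_score = keyword, score
    if best_score >= threshold:
        return best_keyword, best_score
    return None, best_score


# Toy embeddings standing in for the output of a trained audio encoder.
prototypes = {
    "lights on": enroll([[0.9, 0.1, 0.0], [1.0, 0.0, 0.1]]),
    "play music": enroll([[0.0, 0.9, 0.2], [0.1, 1.0, 0.0]]),
}

match, score = detect([0.95, 0.05, 0.05], prototypes)      # close to "lights on"
reject, reject_score = detect([0.5, 0.5, 0.7], prototypes)  # near neither keyword
```

Adding a new keyword in this sketch only means adding another entry to `prototypes`; the encoder itself is untouched, which is the property the abstract describes as handling unseen terms.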
Taxonomy
Topics: Speech Recognition and Synthesis · Music and Audio Processing · Speech and dialogue systems
