Metric Learning for Keyword Spotting
Jaesung Huh, Minjae Lee, Heesoo Heo, Seongkyu Mun, Joon Son Chung

TL;DR
This paper introduces a metric learning approach for keyword spotting that improves detection of target keywords while reducing false alarms from unseen non-target sounds, addressing limitations of traditional classifier-based methods.
Contribution
The work proposes a metric learning method with per-class weighting for keyword spotting. It treats the task as detecting predefined target keywords among unknown sounds rather than as closed-set classification.
Findings
Significantly reduces false alarms for unseen non-target keywords.
Maintains high classification accuracy on the Google Speech Commands dataset.
Outperforms traditional classifier-based keyword spotting methods.
Abstract
The goal of this work is to train effective representations for keyword spotting via metric learning. Most existing works address keyword spotting as a closed-set classification problem, where both target and non-target keywords are predefined. Therefore, prevailing classifier-based keyword spotting systems perform poorly on non-target sounds which are unseen during the training stage, causing high false alarm rates in real-world scenarios. In reality, keyword spotting is a detection problem where predefined target keywords are detected from a variety of unknown sounds. This shares many similarities to metric learning problems in that the unseen and unknown non-target sounds must be clearly differentiated from the target keywords. However, a key difference is that the target keywords are known and predefined. To this end, we propose a new method based on metric learning that maximises…
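The detection framing described in the abstract can be illustrated with a small sketch. The abstract is truncated here, so this is not the authors' training objective. It only shows, under stated assumptions, how a detector might compare an embedding against one vector per target keyword and reject inputs that are not close to any of them. The function names, the 3-D vectors, and the 0.7 threshold are all hypothetical.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def detect_keyword(embedding, class_vectors, threshold=0.7):
    """Return the closest target keyword, or None if no keyword is similar enough.

    Returning None is what separates detection from closed-set
    classification: an unseen sound is rejected instead of being
    forced into one of the known classes.
    """
    best_label, best_score = None, -1.0
    for label, vec in class_vectors.items():
        score = cosine(embedding, vec)
        if score > best_score:
            best_label, best_score = label, score
    return best_label if best_score >= threshold else None


# Hypothetical per-keyword vectors; in practice these would be learned.
class_vectors = {"yes": [1.0, 0.0, 0.0], "no": [0.0, 1.0, 0.0]}

near_yes = [0.9, 0.1, 0.0]   # embedding close to the "yes" vector
unknown = [0.1, 0.1, 0.9]    # embedding far from every target keyword

print(detect_keyword(near_yes, class_vectors))
print(detect_keyword(unknown, class_vectors))
```

A plain argmax over class scores would assign the second input to one of the two keywords, which is the false-alarm behaviour the abstract attributes to classifier-based systems. The threshold trades missed detections against false alarms and would need tuning on held-out data.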
