Dummy Prototypical Networks for Few-Shot Open-Set Keyword Spotting

Byeonggeun Kim; Seunghan Yang; Inseop Chung; Simyung Chang

arXiv:2206.13691 · cs.SD · June 29, 2022

TL;DR

This paper introduces Dummy Prototypical Networks, a novel approach for few-shot open-set keyword spotting that improves open-set detection and is validated on both speech and image benchmarks.

Contribution

The paper proposes Dummy Prototypical Networks, a new metric learning method that enhances open-set detection in few-shot scenarios for keyword spotting and image recognition.

Findings

01

D-ProtoNets outperform recent FSOSR methods on splitGSC.

02

D-ProtoNets achieve state-of-the-art open-set detection on miniImageNet.

03

The approach effectively combines few-shot learning with open-set rejection.

Abstract

Keyword spotting is the task of detecting a keyword in streaming audio. Conventional keyword spotting targets classification of predefined keywords, but there is growing attention to few-shot (query-by-example) keyword spotting, e.g., N-way classification given M-shot support samples. Moreover, in real-world scenarios, there can be utterances from unexpected categories (open-set) that need to be rejected rather than classified as one of the N classes. Combining the two needs, we tackle few-shot open-set keyword spotting with a new benchmark setting, named splitGSC. We propose episode-known dummy prototypes based on metric learning to better detect open-set samples and introduce a simple and powerful approach, Dummy Prototypical Networks (D-ProtoNets). D-ProtoNets show clear margins over recent few-shot open-set recognition (FSOSR) approaches on the suggested splitGSC. We also…
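
To make the N-way, M-shot open-set setting concrete, here is a minimal sketch in plain Python. It is not the authors' implementation: the abstract does not say how the dummy prototype is produced, so `centroid_dummy` simply uses the centroid of the class prototypes as a hypothetical stand-in, and the 2-D vectors stand in for embeddings that a trained audio encoder would supply. The names `build_prototypes`, `centroid_dummy`, `classify_open_set`, and `REJECT` are all illustrative assumptions.

```python
REJECT = "<open-set>"  # hypothetical label for rejected queries


def mean_vector(vectors):
    """Element-wise mean of equally sized vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def squared_distance(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def build_prototypes(support):
    """Map each of the N class labels to the mean of its M support embeddings."""
    return {label: mean_vector(vectors) for label, vectors in support.items()}


def centroid_dummy(prototypes):
    """ASSUMPTION: placeholder dummy prototype, the centroid of the class
    prototypes. The paper's actual dummy generation is not given in the abstract."""
    return mean_vector(list(prototypes.values()))


def classify_open_set(query, prototypes, dummy):
    """Return the nearest class label, or REJECT when the dummy is nearest."""
    candidates = dict(prototypes)
    candidates[REJECT] = dummy
    return min(candidates, key=lambda label: squared_distance(query, candidates[label]))


# 2-way, 2-shot toy episode with made-up 2-D embeddings.
support = {
    "yes": [[0.0, 0.0], [0.0, 2.0]],
    "no": [[4.0, 0.0], [4.0, 2.0]],
}
prototypes = build_prototypes(support)
dummy = centroid_dummy(prototypes)

near_yes = classify_open_set([0.2, 1.0], prototypes, dummy)
near_no = classify_open_set([3.9, 1.2], prototypes, dummy)
unknown = classify_open_set([2.1, 1.0], prototypes, dummy)
```

In this toy episode the first two queries fall closest to a class prototype and the third falls closest to the dummy, so it is rejected. Treating rejection as one extra prototype keeps the decision rule a single nearest-prototype lookup, which is the general idea the abstract points to; how a learned dummy behaves in practice depends on details outside this summary.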

Taxonomy

Topics: Music and Audio Processing · Speech and Audio Processing · Speech Recognition and Synthesis