ASR-free CNN-DTW keyword spotting using multilingual bottleneck features for almost zero-resource languages
Raghav Menon, Herman Kamper, Emre Yilmaz, John Quinn, Thomas Niesler

TL;DR
This paper applies multilingual bottleneck features to a CNN-DTW keyword spotter for nearly zero-resource languages. The CNN is trained on DTW costs and runs much faster than direct DTW, and the multilingual features improve the area under the ROC curve over an MFCC baseline.
Contribution
It combines multilingual bottleneck features, trained on well-resourced languages, with a CNN-DTW keyword spotter for low-resource languages, improving accuracy over an MFCC baseline.
Findings
Multilingual BNFs trained on ten languages improve the area under the ROC curve by 10.9% absolute over the MFCC baseline.
CNN-DTW achieves competitive performance using low-resource supervision.
Combining data from well-resourced languages enhances keyword spotting in under-resourced languages.
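As a reading aid for the ROC AUC figure above, here is a minimal sketch of how the area under the ROC curve can be computed from detection scores. It uses the standard pairwise-ranking formulation (the probability that a randomly chosen positive scores higher than a randomly chosen negative). The function name `roc_auc` and the toy scores are illustrative assumptions, not values or code from the paper.

```python
def roc_auc(scores, labels):
    """Area under the ROC curve via pairwise ranking.

    scores: detection scores, higher meaning "keyword more likely present".
    labels: 1 if the keyword is truly present, 0 otherwise.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")

    wins = 0.0
    for p in pos:
        for q in neg:
            if p > q:
                wins += 1.0      # positive correctly ranked above negative
            elif p == q:
                wins += 0.5      # ties count as half
    return wins / (len(pos) * len(neg))


# Toy data (made up for illustration only).
toy_scores = [0.9, 0.8, 0.3, 0.1]
toy_labels = [1, 0, 1, 0]
toy_auc = roc_auc(toy_scores, toy_labels)
```

An "absolute" improvement, as reported in the abstract, is the plain difference between two such AUC values expressed in percentage points, rather than a ratio of one to the other.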
Abstract
We consider multilingual bottleneck features (BNFs) for nearly zero-resource keyword spotting. This forms part of a United Nations effort using keyword spotting to support humanitarian relief programmes in parts of Africa where languages are severely under-resourced. We use 1920 isolated keywords (40 types, 34 minutes) as exemplars for dynamic time warping (DTW) template matching, which is performed on a much larger body of untranscribed speech. These DTW costs are used as targets for a convolutional neural network (CNN) keyword spotter, giving a much faster system than direct DTW. Here we consider how available data from well-resourced languages can improve this CNN-DTW approach. We show that multilingual BNFs trained on ten languages improve the area under the ROC curve of a CNN-DTW system by 10.9% absolute relative to the MFCC baseline. By combining low-resource DTW-based supervision…
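The DTW template matching described in the abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: the cosine frame distance, the length normalisation, the fixed window equal to the template length, and the names `frame_distance`, `dtw_cost`, and `min_sliding_cost` are all assumptions made for the example. Each frame stands in for a feature vector such as an MFCC or BNF frame.

```python
import math


def frame_distance(a, b):
    """Cosine distance between two feature frames (an assumed choice)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 1.0
    return 1.0 - dot / (norm_a * norm_b)


def dtw_cost(template, segment):
    """Length-normalised DTW alignment cost between two frame sequences."""
    n, m = len(template), len(segment)
    inf = float("inf")
    # acc[i][j] = best accumulated cost aligning the first i template
    # frames with the first j segment frames.
    acc = [[inf] * (m + 1) for _ in range(n + 1)]
    acc[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = frame_distance(template[i - 1], segment[j - 1])
            acc[i][j] = d + min(acc[i - 1][j], acc[i][j - 1], acc[i - 1][j - 1])
    return acc[n][m] / (n + m)


def min_sliding_cost(template, utterance, hop=1):
    """Lowest DTW cost of the template over sliding windows of an utterance."""
    win = len(template)
    if len(utterance) < win:
        return dtw_cost(template, utterance)
    return min(
        dtw_cost(template, utterance[start:start + win])
        for start in range(0, len(utterance) - win + 1, hop)
    )


# Toy two-dimensional "features" (made up for illustration only).
keyword = [[1.0, 0.0], [0.0, 1.0]]
utterance = [[1.0, 1.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
best = min_sliding_cost(keyword, utterance)
```

A low value of `best` suggests the keyword occurs somewhere in the utterance. In the approach the abstract describes, costs of this kind, computed on untranscribed speech, serve as training targets for the CNN, so that at test time only a single forward pass is needed instead of running DTW against every exemplar.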
Taxonomy
Methods: Dynamic Time Warping
