Multitaper mel-spectrograms for keyword spotting
Douglas Baptista de Souza, Khaled Jamal Bakri, Fernanda Ferreira, Juliana Inacio

TL;DR
This paper explores the use of multitaper techniques to enhance feature extraction in keyword spotting, demonstrating improved performance across various test scenarios and neural network models.
Contribution
The paper applies multitaper methods to feature extraction for KWS, an aspect that has received less attention than model architecture innovations.
Findings
Multitaper features outperform traditional features in KWS tasks.
The improved features hold up across different datasets and neural network architectures.
Experimental results confirm the advantages of multitaper features in embedded KWS applications.
Abstract
Keyword spotting (KWS) is one of the speech recognition tasks most sensitive to the quality of the feature representation. However, research on KWS has traditionally focused on new model topologies, putting little emphasis on other aspects like feature extraction. This paper investigates the use of the multitaper technique to create improved features for KWS. The experimental study is carried out for different test scenarios, windows and parameters, datasets, and neural networks commonly used in embedded KWS applications. Experimental results confirm the advantages of using the proposed improved features.
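The abstract does not spell out the feature pipeline, so the following is only a minimal sketch of what a multitaper mel-spectrogram can look like, not the authors' implementation. It assumes sine tapers (one common multitaper family; the paper may use others, such as DPSS windows), a plain average of the per-taper power spectra, an HTK-style triangular mel filterbank, and typical KWS framing (16 kHz audio, 25 ms frames, 10 ms hop, 40 mel bands). The function names and all default parameters are hypothetical choices made for illustration.

```python
import numpy as np


def sine_tapers(frame_len, n_tapers):
    """Return an (n_tapers, frame_len) array of orthonormal sine tapers.

    Assumption: sine tapers stand in for whatever taper family the paper uses.
    """
    n = np.arange(1, frame_len + 1)
    k = np.arange(1, n_tapers + 1)[:, None]
    return np.sqrt(2.0 / (frame_len + 1)) * np.sin(np.pi * k * n / (frame_len + 1))


def mel_filterbank(n_mels, n_fft, sample_rate):
    """Triangular mel filters of shape (n_mels, n_fft // 2 + 1), HTK mel scale."""
    def hz_to_mel(f):
        return 2595.0 * np.log10(1.0 + f / 700.0)

    def mel_to_hz(m):
        return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

    mel_edges = np.linspace(hz_to_mel(0.0), hz_to_mel(sample_rate / 2.0), n_mels + 2)
    hz_edges = mel_to_hz(mel_edges)
    fft_freqs = np.fft.rfftfreq(n_fft, d=1.0 / sample_rate)

    fb = np.zeros((n_mels, fft_freqs.size))
    for m in range(n_mels):
        lo, center, hi = hz_edges[m:m + 3]
        rising = (fft_freqs - lo) / (center - lo)
        falling = (hi - fft_freqs) / (hi - center)
        fb[m] = np.clip(np.minimum(rising, falling), 0.0, None)
    return fb


def multitaper_mel_spectrogram(signal, sample_rate=16000, frame_len=400, hop=160,
                               n_fft=512, n_tapers=4, n_mels=40):
    """Log-mel features from a multitaper spectral estimate (illustrative defaults)."""
    tapers = sine_tapers(frame_len, n_tapers)

    # Slice the signal into overlapping frames: (n_frames, frame_len).
    n_frames = 1 + (len(signal) - frame_len) // hop
    idx = np.arange(frame_len)[None, :] + hop * np.arange(n_frames)[:, None]
    frames = signal[idx]

    # Apply every taper to every frame: (n_frames, n_tapers, frame_len).
    tapered = frames[:, None, :] * tapers[None, :, :]

    # One power spectrum per taper, then average across tapers. Averaging
    # several tapered estimates reduces the variance of the spectral estimate
    # compared with a single-window periodogram.
    power = (np.abs(np.fft.rfft(tapered, n=n_fft, axis=-1)) ** 2).mean(axis=1)

    # Project onto mel bands and compress with a log.
    mel = power @ mel_filterbank(n_mels, n_fft, sample_rate).T
    return np.log(mel + 1e-10)
```

With these defaults, one second of 16 kHz audio yields a 98 x 40 feature matrix, which can be fed to a KWS network in place of a conventional single-window log-mel spectrogram. Setting n_tapers to 1 reduces the sketch to a single-window estimate, which makes the two feature types easy to compare.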
Taxonomy
Topics: Advanced Text Analysis Techniques
