Multitaper mel-spectrograms for keyword spotting

Douglas Baptista de Souza; Khaled Jamal Bakri; Fernanda Ferreira; Juliana Inacio

arXiv:2407.04662·eess.AS·July 8, 2024

TL;DR

This paper explores the use of multitaper techniques to enhance feature extraction in keyword spotting, demonstrating improved performance across various test scenarios and neural network models.

Contribution

The paper introduces a novel application of multitaper methods to feature extraction in KWS, an aspect that has been underexplored compared to model architecture innovations.

Findings

1. Multitaper features outperform traditional features in KWS tasks.

2. Improved features show robustness across different datasets and neural network architectures.

3. Experimental results confirm the advantages of multitaper features in embedded KWS applications.

Abstract

Keyword spotting (KWS) is one of the speech recognition tasks most sensitive to the quality of the feature representation. However, the research on KWS has traditionally focused on new model topologies, putting little emphasis on other aspects like feature extraction. This paper investigates the use of the multitaper technique to create improved features for KWS. The experimental study is carried out for different test scenarios, windows and parameters, datasets, and neural networks commonly used in embedded KWS applications. Experiment results confirm the advantages of using the proposed improved features.
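
To make the idea concrete, the following is a minimal sketch of how a multitaper mel-spectrogram could be computed. It is not the authors' implementation: the abstract does not state which tapers or parameters the paper uses, so the sine tapers, the 400-sample frame, 160-sample hop, 4 tapers, and 40 mel bands are all assumptions chosen for illustration (discrete prolate spheroidal sequences are another common taper family). The function names are hypothetical. The core step is that each frame is multiplied by several orthogonal tapers and the resulting periodograms are averaged, which lowers the variance of the spectral estimate compared with a single window.

```python
import numpy as np


def sine_tapers(frame_len, num_tapers):
    """Return orthonormal sine tapers, shape (num_tapers, frame_len).

    Assumption: sine tapers are used here only because they have a
    closed form; the paper may use a different taper family.
    """
    n = np.arange(1, frame_len + 1)
    k = np.arange(1, num_tapers + 1)[:, None]
    return np.sqrt(2.0 / (frame_len + 1)) * np.sin(np.pi * k * n / (frame_len + 1))


def multitaper_power(frame, tapers):
    """Average the periodograms of one frame over all tapers."""
    spectra = np.fft.rfft(tapers * frame, axis=-1)
    return np.mean(np.abs(spectra) ** 2, axis=0)


def mel_filterbank(num_mels, n_fft, sample_rate):
    """Triangular mel filters, shape (num_mels, n_fft // 2 + 1)."""
    def hz_to_mel(f):
        return 2595.0 * np.log10(1.0 + f / 700.0)

    def mel_to_hz(m):
        return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

    # Band edges are equally spaced on the mel scale from 0 Hz to Nyquist.
    edges_hz = mel_to_hz(np.linspace(0.0, hz_to_mel(sample_rate / 2.0), num_mels + 2))
    freqs = np.fft.rfftfreq(n_fft, d=1.0 / sample_rate)
    lower, center, upper = edges_hz[:-2, None], edges_hz[1:-1, None], edges_hz[2:, None]
    rising = (freqs - lower) / (center - lower)
    falling = (upper - freqs) / (upper - center)
    return np.clip(np.minimum(rising, falling), 0.0, None)


def multitaper_mel_spectrogram(signal, sample_rate, frame_len=400, hop=160,
                               num_tapers=4, num_mels=40):
    """Log mel energies per frame, shape (num_frames, num_mels)."""
    tapers = sine_tapers(frame_len, num_tapers)
    fbank = mel_filterbank(num_mels, frame_len, sample_rate)
    starts = range(0, len(signal) - frame_len + 1, hop)
    power = np.stack([multitaper_power(signal[s:s + frame_len], tapers) for s in starts])
    # Small constant avoids log(0) for empty or silent bands.
    return np.log(power @ fbank.T + 1e-10)
```

Setting `num_tapers=1` reduces the sketch to an ordinary single-window mel-spectrogram, which makes it straightforward to compare the two feature types on the same audio.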

Taxonomy

Topics: Advanced Text Analysis Techniques