Exploring Filterbank Learning for Keyword Spotting

Iván López-Espejo; Zheng-Hua Tan; Jesper Jensen

arXiv:2006.00217 · eess.AS · June 2, 2020

TL;DR

This paper investigates learned filterbanks for keyword spotting (KWS) and compares them with handcrafted speech features. In general, it finds no statistically significant difference in KWS accuracy between the two, which suggests that handcrafted features remain a sound choice while leaving room for future research.

Contribution

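The paper explores two filterbank learning approaches for KWS: learning a filterbank matrix in the power spectral domain, and learning the parameters of a psychoacoustically motivated gammachirp filterbank. In both cases the filterbank parameters are optimized jointly with a deep residual neural network KWS back-end.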
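As a rough illustration of what a gammachirp filter looks like, the sketch below generates one impulse response using the standard gammachirp form g(t) = t^(n-1) exp(-2π b ERB(fc) t) cos(2π fc t + c ln t + φ). The function names, the default values of `n`, `b`, `c`, and the 25 ms duration are assumptions chosen for illustration; this summary does not state which parameters the paper learns or how they are initialized.

```python
import numpy as np

def erb(fc):
    """Equivalent rectangular bandwidth in Hz (Glasberg and Moore approximation)."""
    return 24.7 + 0.108 * fc

def gammachirp(fc, sample_rate=16000, duration=0.025, n=4, b=1.019, c=-2.0, phase=0.0):
    """Peak-normalized gammachirp impulse response at centre frequency fc (Hz).

    All default parameter values are illustrative assumptions, not the paper's.
    """
    n_samples = int(round(duration * sample_rate))
    # Start one sample after t = 0 so that ln(t) is defined.
    t = np.arange(1, n_samples + 1) / sample_rate
    envelope = t ** (n - 1) * np.exp(-2.0 * np.pi * b * erb(fc) * t)
    carrier = np.cos(2.0 * np.pi * fc * t + c * np.log(t) + phase)
    g = envelope * carrier
    return g / np.max(np.abs(g))

# Example: a small bank of filters at hypothetical centre frequencies.
bank = np.stack([gammachirp(fc) for fc in (250.0, 1000.0, 4000.0)])
```

Setting `c=0` removes the chirp term and reduces the filter to a gammatone. In a learned version, parameters such as these would be treated as trainable and updated by backpropagation together with the back-end.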

Findings

1. No significant accuracy difference between learned and handcrafted features

2. Handcrafted features remain a strong choice for modern KWS

3. Potential information redundancy in features suggests new research directions

Abstract

Despite their great performance over the years, handcrafted speech features are not necessarily optimal for any particular speech application. Consequently, with greater or lesser success, optimal filterbank learning has been studied for different speech processing tasks. In this paper, we fill in a gap by exploring filterbank learning for keyword spotting (KWS). Two approaches are examined: filterbank matrix learning in the power spectral domain and parameter learning of a psychoacoustically-motivated gammachirp filterbank. Filterbank parameters are optimized jointly with a modern deep residual neural network-based KWS back-end. Our experimental results reveal that, in general, there are no statistically significant differences, in terms of KWS accuracy, between using a learned filterbank and handcrafted speech features. Thus, while we conclude that the latter are still a wise choice…
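The first approach, filterbank matrix learning in the power spectral domain, can be pictured as a single matrix that maps each power spectrum frame to filterbank energies. The sketch below shows only the forward pass of such a layer, assuming a mel-shaped initialization, a non-negativity clamp, and log compression; these choices and all function names are illustrative assumptions, not details confirmed by the abstract. In training, the `weights` matrix would be updated by backpropagation jointly with the back-end, which is omitted here.

```python
import numpy as np

def hz_to_mel(f):
    return 2595.0 * np.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def mel_filterbank(n_filters, n_fft, sample_rate):
    """Triangular mel filterbank of shape (n_filters, n_fft // 2 + 1).

    Used here only as an assumed starting point for the learnable matrix.
    """
    n_bins = n_fft // 2 + 1
    bin_freqs = np.linspace(0.0, sample_rate / 2.0, n_bins)
    edges = mel_to_hz(np.linspace(0.0, hz_to_mel(sample_rate / 2.0), n_filters + 2))
    fb = np.zeros((n_filters, n_bins))
    for k in range(n_filters):
        lo, mid, hi = edges[k], edges[k + 1], edges[k + 2]
        rising = (bin_freqs - lo) / (mid - lo)
        falling = (hi - bin_freqs) / (hi - mid)
        fb[k] = np.clip(np.minimum(rising, falling), 0.0, None)
    return fb

def filterbank_features(power_spec, weights, eps=1e-8):
    """Map power spectra (n_frames, n_bins) to log filterbank energies.

    weights has shape (n_filters, n_bins) and plays the role of the
    learnable filterbank matrix.
    """
    w = np.maximum(weights, 0.0)  # assumed constraint: non-negative filter gains
    return np.log(power_spec @ w.T + eps)

# Example with random frames standing in for windowed speech.
rng = np.random.default_rng(0)
frames = rng.standard_normal((98, 512))
power_spec = np.abs(np.fft.rfft(frames, n=512, axis=1)) ** 2
weights = mel_filterbank(n_filters=40, n_fft=512, sample_rate=16000)
features = filterbank_features(power_spec, weights)  # shape (98, 40)
```

With the matrix frozen at its initial value, this reduces to ordinary log-mel features, which is the kind of handcrafted baseline the learned filterbank is compared against.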
