Attention-Free Keyword Spotting

Mashrur M. Morshed; Ahmad Omar Ahsan

arXiv:2110.07749 · cs.LG · April 12, 2022


TL;DR

This paper investigates replacing self-attention with gated MLPs for keyword spotting, and shows that MLP-based models with fewer than 0.5 million parameters reach competitive accuracy on the Google Speech Commands V2-12 and V2-35 benchmarks.

Contribution

The paper introduces a family of efficient MLP-based keyword spotting models that reach competitive accuracy with far fewer parameters than attention-based models.

Findings

01  MLP-based models achieve competitive accuracy on the Google Speech Commands V2-12 and V2-35 benchmarks.

02  The models have fewer than 0.5 million parameters, far fewer than their attention-based counterparts.

03  The approach demonstrates that attention-free models are viable for keyword spotting.
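
To make the parameter budget in finding 02 concrete, the sketch below counts the weights of a small stack of gated-MLP blocks. The layer layout (input projection, spatial gating over the time axis, output projection) and every size used here (40 input features, 98 time steps, width 64, hidden width 256, 12 blocks, 35 classes) are illustrative assumptions; this summary does not state the paper's configuration.

```python
# Hypothetical parameter count for a stack of gated-MLP blocks.
# All sizes below are assumptions chosen for illustration.

def linear_params(n_in, n_out):
    # Weight matrix plus bias vector.
    return n_in * n_out + n_out

def layer_norm_params(dim):
    # Learnable scale and shift.
    return 2 * dim

def gated_mlp_block_params(seq_len, d_model, d_ffn):
    half = d_ffn // 2  # the hidden features are split into two halves
    return (
        layer_norm_params(d_model)          # norm before the block
        + linear_params(d_model, d_ffn)     # input projection
        + layer_norm_params(half)           # norm on the gating half
        + linear_params(seq_len, seq_len)   # projection across time steps
        + linear_params(half, d_model)      # output projection
    )

def model_params(n_feats, seq_len, d_model, d_ffn, n_layers, n_classes):
    embed = linear_params(n_feats, d_model)  # per-frame embedding
    blocks = n_layers * gated_mlp_block_params(seq_len, d_model, d_ffn)
    head = layer_norm_params(d_model) + linear_params(d_model, n_classes)
    return embed + blocks + head

total = model_params(n_feats=40, seq_len=98, d_model=64,
                     d_ffn=256, n_layers=12, n_classes=35)
print(total, total < 500_000)
```

Under these assumed sizes the only term that grows with sequence length is the time-axis projection, which is quadratic in the number of time steps but shared across all feature channels; with short keyword clips it stays small.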

Abstract

Until now, attention-based models have been used with great success in the keyword spotting problem domain. However, in light of recent advances in deep learning, the question arises whether self-attention is truly irreplaceable for recognizing speech keywords. We thus explore the use of gated MLPs, previously shown to be alternatives to transformers in vision tasks, for the keyword spotting task. We provide a family of highly efficient MLP-based models for keyword spotting, with fewer than 0.5 million parameters. We show that our approach achieves competitive performance on the Google Speech Commands V2-12 and V2-35 benchmarks with far fewer parameters than self-attention-based methods.
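
As a rough illustration of what a gated MLP block can look like, the NumPy sketch below implements one block with a spatial gating unit: the hidden features are split in two, one half is mixed across time steps by a learned matrix, and the result gates the other half elementwise. This is a minimal sketch under stated assumptions, not the authors' implementation: the abstract does not spell out the architecture, layer norm is written without learnable scale and shift, and the function names, sizes, and initial values are hypothetical.

```python
import numpy as np

def layer_norm(x, eps=1e-5):
    # Simplified layer norm over the feature axis (no learnable scale/shift).
    mean = x.mean(axis=-1, keepdims=True)
    var = x.var(axis=-1, keepdims=True)
    return (x - mean) / np.sqrt(var + eps)

def gelu(x):
    # tanh approximation of GELU.
    return 0.5 * x * (1.0 + np.tanh(np.sqrt(2.0 / np.pi) * (x + 0.044715 * x**3)))

def init_block(seq_len, d_model, d_ffn, rng):
    # Hypothetical initialisation: zero spatial weights and unit bias make
    # the gate start as an identity, so the block begins as a plain MLP.
    return {
        "w_in": rng.normal(0.0, 0.02, (d_model, d_ffn)),
        "b_in": np.zeros(d_ffn),
        "w_spatial": np.zeros((seq_len, seq_len)),
        "b_spatial": np.ones(seq_len),
        "w_out": rng.normal(0.0, 0.02, (d_ffn // 2, d_model)),
        "b_out": np.zeros(d_model),
    }

def gmlp_block(x, p):
    """One gated-MLP block. x has shape (seq_len, d_model)."""
    shortcut = x
    h = gelu(layer_norm(x) @ p["w_in"] + p["b_in"])    # (seq_len, d_ffn)
    u, v = np.split(h, 2, axis=-1)                     # two halves of the features
    v = layer_norm(v)
    v = p["w_spatial"] @ v + p["b_spatial"][:, None]   # mix information across time
    h = u * v                                          # elementwise gating
    return shortcut + h @ p["w_out"] + p["b_out"]      # residual connection

rng = np.random.default_rng(0)
frames = rng.normal(size=(98, 64))   # assumed: 98 time steps, 64 features
params = init_block(98, 64, 256, rng)
out = gmlp_block(frames, params)
print(out.shape)
```

The time-axis matrix `w_spatial` is the only place where time steps interact, which is the role self-attention plays in a transformer; here the mixing weights are fixed after training rather than computed from the input.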

Code & Models

Repositories

AI-Research-BD/Keyword-MLP (PyTorch, official)

Taxonomy

Topics: Speech Recognition and Synthesis · Music and Audio Processing · Topic Modeling