Tiny-CRNN: Streaming Wakeword Detection In A Low Footprint Setting

Mohammad Omar Khursheed; Christin Jose; Rajath Kumar; Gengshen Fu; Brian Kulis; Santosh Kumar Cheekatmalla

arXiv:2109.14725·cs.LG·October 1, 2021

TL;DR

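This paper introduces Tiny-CRNN, a low-footprint convolutional recurrent architecture augmented with scaled dot product attention for streaming wakeword detection. It reduces false accepts by 25% relative to CNN models at a 250k parameter budget and by up to 32% relative to word-level DNN models at a 50k parameter budget, while using fewer parameters in both cases.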

Contribution

The paper presents Tiny-CRNN models with scaled dot product attention for wakeword detection, and shows that they produce fewer false accepts with fewer parameters than CNN and word-level DNN baselines in low-footprint settings.

Findings

01

25% reduction in false accepts relative to CNN models at a 250k parameter budget, with 10% fewer parameters

02

Up to 32% reduction in false accepts relative to word-level DNN models at a 50k parameter budget, with 75% fewer parameters

03

Discusses solutions for streaming inference with this architecture, and compares start-end index errors and latency against CNN, DNN, and DNN-HMM models
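The findings above are stated against parameter budgets (250k and 50k). As a rough illustration of how such a budget can be checked, the sketch below counts trainable parameters for a small convolution, recurrent, and dense stack. The layer sizes, the choice of a GRU, and the helper names are all hypothetical; they are not the paper's configuration, which this page does not give.

```python
# A minimal sketch of parameter counting for a small CRNN-style stack.
# ASSUMPTIONS: all layer sizes below are hypothetical, and a GRU with one
# bias vector per gate is assumed; none of this comes from the paper.

def conv2d_params(kernel_h, kernel_w, channels_in, channels_out):
    # One kernel_h x kernel_w x channels_in filter plus one bias per output channel.
    return (kernel_h * kernel_w * channels_in + 1) * channels_out

def gru_params(input_size, hidden_size):
    # Three gates (update, reset, candidate), each with input weights,
    # recurrent weights, and a single bias vector.
    per_gate = input_size * hidden_size + hidden_size * hidden_size + hidden_size
    return 3 * per_gate

def dense_params(size_in, size_out):
    # Weight matrix plus one bias per output unit.
    return (size_in + 1) * size_out

def fits_budget(total, budget):
    return total <= budget

# Hypothetical configuration, for illustration only.
total = (
    conv2d_params(5, 3, 1, 16)   # one convolutional layer over the spectrogram
    + gru_params(64, 32)         # recurrent layer over flattened conv features
    + dense_params(32, 2)        # wakeword / non-wakeword output
)
print(total, fits_budget(total, 50_000))
```

Some frameworks store two bias vectors per GRU gate, which would change the recurrent count slightly; the point here is only the bookkeeping, not an exact match to any toolkit.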

Abstract

In this work, we propose Tiny-CRNN (Tiny Convolutional Recurrent Neural Network) models applied to the problem of wakeword detection, and augment them with scaled dot product attention. We find that, compared to Convolutional Neural Network models, False Accepts in a 250k parameter budget can be reduced by 25% with a 10% reduction in parameter size by using models based on the Tiny-CRNN architecture, and we can get up to 32% reduction in False Accepts at a 50k parameter budget with 75% reduction in parameter size compared to word-level Dense Neural Network models. We discuss solutions to the challenging problem of performing inference on streaming audio with this architecture, as well as differences in start-end index errors and latency in comparison to CNN, DNN, and DNN-HMM models.
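The abstract mentions augmenting the models with scaled dot product attention. For readers unfamiliar with the operation, here is a minimal pure-Python sketch of the standard formulation, softmax(QK^T / sqrt(d_k)) V. How the paper wires attention into the Tiny-CRNN (what serves as queries, keys, and values, and over which frames) is not stated on this page, so the function name and the example inputs are assumptions for illustration.

```python
import math

def softmax(scores):
    # Subtract the maximum for numerical stability before exponentiating.
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def scaled_dot_product_attention(queries, keys, values):
    """Standard scaled dot product attention over lists of vectors.

    queries: list of vectors of length d_k
    keys:    list of vectors of length d_k (one per time step)
    values:  list of vectors (one per time step, same count as keys)
    Returns one output vector per query.
    """
    d_k = len(keys[0])
    scale = 1.0 / math.sqrt(d_k)
    outputs = []
    for q in queries:
        # Similarity of this query to every key, scaled by 1/sqrt(d_k).
        scores = [scale * sum(qi * ki for qi, ki in zip(q, k)) for k in keys]
        weights = softmax(scores)
        # Weighted sum of the value vectors.
        dim = len(values[0])
        outputs.append(
            [sum(w * v[j] for w, v in zip(weights, values)) for j in range(dim)]
        )
    return outputs

# Hypothetical example: two time steps of 2-dimensional features.
frames = [[1.0, 0.0], [0.0, 1.0]]
context = scaled_dot_product_attention([[1.0, 0.0]], frames, frames)
print(context)
```

In a recurrent model, a common pattern is to attend over the per-frame hidden states so that the classifier sees a weighted summary rather than only the last state; whether Tiny-CRNN does exactly this is left to the full paper.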
