Honk: A PyTorch Reimplementation of Convolutional Neural Networks for Keyword Spotting

Raphael Tang; Jimmy Lin

arXiv:1710.06554·cs.CL·November 29, 2017·34 cites

TL;DR

Honk is an open-source PyTorch implementation of CNN models for keyword spotting, providing a comparable accuracy baseline on Google's Speech Commands Dataset to facilitate future research in speech command recognition.

Contribution

Honk offers a PyTorch reimplementation of the CNN-based keyword spotting models originally included as examples in TensorFlow, aiding reproducibility and further development.

Findings

01 Comparable accuracy to original TensorFlow models

02 Provides a practical PyTorch baseline for keyword spotting

03 Facilitates future research in speech command recognition

Abstract

We describe Honk, an open-source PyTorch reimplementation of convolutional neural networks for keyword spotting that are included as examples in TensorFlow. These models are useful for recognizing "command triggers" in speech-based interfaces (e.g., "Hey Siri"), which serve as explicit cues for audio recordings of utterances that are sent to the cloud for full speech recognition. Evaluation on Google's recently released Speech Commands Dataset shows that our reimplementation is comparable in accuracy and provides a starting point for future work on the keyword spotting task.

Taxonomy

Topics: Advanced Text Analysis Techniques