Honk: A PyTorch Reimplementation of Convolutional Neural Networks for Keyword Spotting
Raphael Tang, Jimmy Lin

TL;DR
Honk is an open-source PyTorch reimplementation of CNN models for keyword spotting. On Google's Speech Commands Dataset it reaches accuracy comparable to the TensorFlow reference models, giving future research in speech command recognition a baseline to build on.
Contribution
Honk reimplements in PyTorch the CNN-based keyword spotting models that are included as examples in TensorFlow, aiding reproducibility and further development.
Findings
Accuracy comparable to the original TensorFlow models on Google's Speech Commands Dataset
Provides a practical PyTorch baseline for keyword spotting
Facilitates future research in speech command recognition
Abstract
We describe Honk, an open-source PyTorch reimplementation of convolutional neural networks for keyword spotting that are included as examples in TensorFlow. These models are useful for recognizing "command triggers" in speech-based interfaces (e.g., "Hey Siri"), which serve as explicit cues for audio recordings of utterances that are sent to the cloud for full speech recognition. Evaluation on Google's recently released Speech Commands Dataset shows that our reimplementation is comparable in accuracy and provides a starting point for future work on the keyword spotting task.
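The role of a command trigger described in the abstract can be illustrated with a minimal sketch: an on-device scorer watches a stream of audio windows, and only the audio that follows a detected trigger is passed on for full recognition. This is not Honk's code. The function name `gate_utterances`, the threshold, the capture length, and the toy scorer are all hypothetical, chosen only to make the control flow concrete; in practice the scorer would be a trained keyword spotting model such as a CNN.

```python
from typing import Callable, Iterable, List, Sequence

# Hypothetical type: maps one window of audio features to the probability
# that the trigger phrase was spoken. Any callable works for this sketch.
Scorer = Callable[[Sequence[float]], float]


def gate_utterances(
    windows: Iterable[Sequence[float]],
    score: Scorer,
    threshold: float = 0.8,  # assumed value, not taken from the paper
    capture_len: int = 3,    # assumed number of windows kept after a trigger
) -> List[List[Sequence[float]]]:
    """Return the groups of windows that would be sent on for full recognition."""
    utterances: List[List[Sequence[float]]] = []
    current = None  # windows captured for the utterance in progress, if any
    for window in windows:
        if current is not None:
            # A trigger fired earlier: record this window as part of the utterance.
            current.append(window)
            if len(current) == capture_len:
                utterances.append(current)
                current = None
        elif score(window) >= threshold:
            # Trigger detected: start recording with the next window.
            current = []
    if current:
        # The stream ended mid-utterance; keep what was captured.
        utterances.append(current)
    return utterances


def toy_score(window: Sequence[float]) -> float:
    # Toy stand-in for a trained model: the feature is already a probability.
    return window[0]


stream = [[0.1], [0.9], [0.2], [0.3], [0.4], [0.1], [0.95], [0.5]]
sent = gate_utterances(stream, toy_score, threshold=0.8, capture_len=3)
print(sent)
```

Windows that arrive while no trigger is active are never stored, so in this sketch nothing is forwarded until the keyword spotter fires.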
Taxonomy
Topics: Advanced Text Analysis Techniques
