WeKws: A production first small-footprint end-to-end Keyword Spotting Toolkit

Jie Wang; Menglong Xu; Jingyong Hou; Binbin Zhang; Xiao-Lei Zhang; Lei Xie; Fuping Pan

arXiv:2210.16743·eess.AS·November 1, 2022

TL;DR

WeKws is a production-ready, efficient end-to-end keyword spotting toolkit that simplifies training and deployment, achieving competitive results on multiple datasets for real-world speech interaction applications.

Contribution

It introduces a practical, easy-to-use E2E KWS toolkit with a refined max-pooling loss that lets the model learn the keyword's ending position by itself, narrowing the gap between research and deployment.

Findings

01. Achieves highly competitive results on three publicly available datasets.

02. Simplifies the training pipeline with a refined max-pooling loss.

03. Enables efficient deployment in real-world scenarios.

Abstract

Keyword spotting (KWS) enables speech-based user interaction and is gradually becoming an indispensable component of smart devices. Recently, end-to-end (E2E) methods have become the most popular approach for on-device KWS tasks. However, there is still a gap between the research and deployment of E2E KWS methods. In this paper, we introduce WeKws, a production-quality, easy-to-build, and easy-to-apply E2E KWS toolkit. WeKws contains implementations of several state-of-the-art backbone networks, with which it achieves highly competitive results on three publicly available datasets. To make WeKws a pure E2E toolkit, we utilize a refined max-pooling loss that lets the model learn the ending position of the keyword by itself, which significantly simplifies the training pipeline and makes WeKws efficient to apply in real-world scenarios. The toolkit is publicly available at…
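To illustrate the idea of a max-pooling loss mentioned in the abstract, here is a minimal sketch in plain Python. It assumes a single keyword and a model that emits one keyword posterior per frame; the function names `max_pooling_loss` and `peak_frame`, the `eps` clamp, and the binary cross-entropy form are illustrative assumptions, not the toolkit's actual implementation, and the paper's specific refinements are not reproduced here. The point it shows is that only the highest-scoring frame is trained, so no frame-level alignment of the keyword ending is needed.

```python
import math


def max_pooling_loss(frame_posteriors, is_keyword, eps=1e-8):
    """Hypothetical max-pooling loss for one utterance.

    frame_posteriors: per-frame keyword probabilities in [0, 1].
    is_keyword: True if the utterance contains the keyword.
    """
    if not frame_posteriors:
        raise ValueError("utterance has no frames")
    # Pool over time: only the highest-scoring frame contributes.
    peak = max(frame_posteriors)
    # Clamp to keep the logarithm finite (an assumption of this sketch).
    peak = min(max(peak, eps), 1.0 - eps)
    if is_keyword:
        # Positive utterance: push the best frame towards 1.
        return -math.log(peak)
    # Negative utterance: push the most confusable frame towards 0,
    # which bounds every other frame as well.
    return -math.log(1.0 - peak)


def peak_frame(frame_posteriors):
    """Index of the frame the loss acts on (first one on ties)."""
    return max(range(len(frame_posteriors)),
               key=frame_posteriors.__getitem__)


if __name__ == "__main__":
    scores = [0.1, 0.9, 0.2]
    print(peak_frame(scores))
    print(max_pooling_loss(scores, is_keyword=True))
    print(max_pooling_loss(scores, is_keyword=False))
```

Because the frame that receives the gradient is chosen by the model's own scores, the position of the peak is learned during training rather than supplied by a forced alignment, which is the simplification the abstract describes.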

Code & Models

Repositories

wenet-e2e/wekws
PyTorch · Official

Taxonomy

Topics: ICT in Developing Communities · Speech and dialogue systems · Topic Modeling