Optimizing Dysarthria Wake-Up Word Spotting: An End-to-End Approach for SLT 2024 LRDWWS Challenge
Shuiyun Liu, Yuxiang Kong, Pengcheng Guo, Weiji Zhuang, Peng Gao, Yujun Wang, Lei Xie

TL;DR
This paper introduces an end-to-end pretrain-based system for dysarthria wake-up word spotting, improving accuracy and reducing false accepts by combining a multi-task model with a dual-filter strategy, winning the SLT 2024 challenge.
Contribution
The paper presents a novel 2-branch data2vec2 model for joint ASR and wake-up word spotting, along with a dual-filter approach to enhance performance in dysarthria speech recognition.
Findings
Achieved an FAR of 0.00321 and an FRR of 0.005 on test data.
Secured first place in the SLT 2024 LRDWWS Challenge.
Demonstrated the effectiveness of multi-task learning and dual-filter strategy.
Abstract
Speech has emerged as a widely embraced user interface across diverse applications. However, for individuals with dysarthria, the inherent variability in their speech poses significant challenges. This paper presents an end-to-end Pretrain-based Dual-filter Dysarthria Wake-up word Spotting (PD-DWS) system for the SLT 2024 Low-Resource Dysarthria Wake-Up Word Spotting Challenge. Specifically, our system improves performance from two key perspectives: audio modeling and dual-filter strategy. For audio modeling, we propose an innovative 2branch-d2v2 model based on the pre-trained data2vec2 (d2v2), which can simultaneously model automatic speech recognition (ASR) and wake-up word spotting (WWS) tasks through a unified multi-task finetuning paradigm. Additionally, a dual-filter strategy is introduced to reduce the false accept rate (FAR) while maintaining the same false reject rate (FRR).…
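The two metrics named in the abstract can be illustrated with a short sketch. It assumes the common per-utterance definitions (FAR is the fraction of non-wake-word utterances that are wrongly accepted, FRR is the fraction of wake-word utterances that are wrongly rejected). The challenge's exact scoring protocol is not stated here, and the function name `far_frr` and the toy data are hypothetical, not taken from the paper.

```python
def far_frr(labels, predictions):
    """Return (FAR, FRR) for binary wake-up word decisions.

    labels:      True if the utterance really contains the wake-up word.
    predictions: True if the system fired on that utterance.
    Assumed definitions (not confirmed by the paper summary):
      FAR = false accepts / number of negative utterances
      FRR = false rejects / number of positive utterances
    """
    if len(labels) != len(predictions):
        raise ValueError("labels and predictions must have the same length")

    positives = sum(1 for y in labels if y)
    negatives = len(labels) - positives

    false_accepts = sum(1 for y, p in zip(labels, predictions) if not y and p)
    false_rejects = sum(1 for y, p in zip(labels, predictions) if y and not p)

    # Guard against empty classes to avoid division by zero.
    far = false_accepts / negatives if negatives else 0.0
    frr = false_rejects / positives if positives else 0.0
    return far, frr


# Toy data: 4 wake-word utterances (1 missed), 5 other utterances (1 false alarm).
toy_labels = [True, True, True, True, False, False, False, False, False]
toy_preds = [True, True, True, False, False, False, True, False, False]
toy_far, toy_frr = far_frr(toy_labels, toy_preds)
```

Under these assumptions, a filtering stage that rejects more candidate detections can only lower or preserve the false-accept count, so the practical question is whether it does so without turning true detections into false rejects.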
Taxonomy
Topics: Voice and Speech Disorders · Tracheal and airway disorders · Dysphagia Assessment and Management
