PB-LRDWWS System for the SLT 2024 Low-Resource Dysarthria Wake-Up Word Spotting Challenge

Shiyao Wang; Jiaming Zhou; Shiwan Zhao; Yong Qin

arXiv:2409.04799 · cs.SD · December 9, 2024

TL;DR

The paper presents PB-LRDWWS, a prototype-based wake-up word spotting system for dysarthric speech built on a fine-tuned HuBERT feature extractor. The system placed second on the final Test-B of the SLT 2024 Low-Resource Dysarthria Wake-Up Word Spotting Challenge.

Contribution

It introduces a novel combination of a fine-tuned HuBERT feature extractor with prototype-based classification for dysarthric speech recognition.

Findings

01

Achieved second place in the LRDWWS Challenge

02

Effective prototype-based classification for low-resource dysarthric speech

03

Demonstrated simplicity and effectiveness of the approach

Abstract

For the SLT 2024 Low-Resource Dysarthria Wake-Up Word Spotting (LRDWWS) Challenge, we introduce the PB-LRDWWS system. This system combines a dysarthric speech content feature extractor for prototype construction with a prototype-based classification method. The feature extractor is a fine-tuned HuBERT model obtained through a three-stage fine-tuning process using cross-entropy loss. This fine-tuned HuBERT extracts features from the target dysarthric speaker's enrollment speech to build prototypes. Classification is achieved by calculating the cosine similarity between the HuBERT features of the target dysarthric speaker's evaluation speech and prototypes. Despite its simplicity, our method demonstrates effectiveness through experimental results. Our system achieves second place in the final Test-B of the LRDWWS Challenge.
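The prototype step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes each utterance has already been reduced to one fixed-length feature vector (for example by averaging HuBERT frame features, which the abstract does not specify) and that a prototype is the mean of a keyword's enrollment vectors. The function names `build_prototypes` and `classify` and the toy 2-dimensional data are hypothetical.

```python
import numpy as np


def build_prototypes(features, labels):
    """Average the enrollment vectors of each keyword into one prototype.

    Assumption: one fixed-length vector per enrollment utterance.
    """
    features = np.asarray(features, dtype=float)
    labels = np.asarray(labels)
    classes = np.unique(labels)  # sorted unique keyword labels
    prototypes = np.stack(
        [features[labels == c].mean(axis=0) for c in classes]
    )
    return classes, prototypes


def classify(query, classes, prototypes):
    """Return the keyword whose prototype has the highest cosine similarity."""
    query = np.asarray(query, dtype=float)
    q = query / np.linalg.norm(query)
    p = prototypes / np.linalg.norm(prototypes, axis=1, keepdims=True)
    sims = p @ q  # cosine similarity to every prototype
    return classes[int(np.argmax(sims))], sims


# Toy stand-ins for pooled features of one speaker's enrollment speech.
enroll = [[1.0, 0.1], [0.9, 0.0], [0.0, 1.0], [0.1, 0.9]]
enroll_labels = ["wake_a", "wake_a", "wake_b", "wake_b"]

classes, prototypes = build_prototypes(enroll, enroll_labels)
predicted, similarities = classify([0.8, 0.2], classes, prototypes)
```

Because the prototypes are built from the target speaker's own enrollment speech, the sketch would be rerun for each speaker. Only the feature extractor is shared across speakers.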

Code & Models

Repositories

nku-hlt/pb-dsr
pytorch

Taxonomy

Topics: Speech Recognition and Synthesis · Voice and Speech Disorders · Phonetics and Phonology Research