Variational Auto-Encoder Based Variability Encoding for Dysarthric Speech Recognition

Xurong Xie; Rukiye Ruzi; Xunying Liu; Lan Wang

arXiv:2201.09422·eess.AS·June 17, 2024

TL;DR

This paper introduces a variational auto-encoder based variability encoder (VAEVE) to explicitly model and encode acoustic variability in dysarthric speech, improving recognition accuracy.

Contribution

The novel VAEVE method explicitly encodes phoneme-independent variability using a variational auto-encoder, enhancing dysarthric speech recognition performance.

Findings

01

VAEVE encodings reduce word error rate (WER) by up to 2.2%.

02

VAEVE provides complementary information to existing speaker adaptation methods.

03

Systems with VAEVE outperform baselines without variability encoding.

Abstract

Dysarthric speech recognition is a challenging task because of acoustic variability and the limited amount of available data. The diverse conditions of dysarthric speakers account for this acoustic variability, which makes it difficult to model precisely. This paper presents a variational auto-encoder based variability encoder (VAEVE) that explicitly encodes such variability for dysarthric speech. The VAEVE uses both phoneme information and a low-dimensional latent variable to reconstruct the input acoustic features, so that the latent variable is forced to encode the phoneme-independent variability. The stochastic gradient variational Bayes algorithm is applied to model the distribution for generating variability encodings, which are then used as auxiliary features for DNN acoustic modeling. Experimental results on the UASpeech corpus show that the VAEVE based variability…
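The reconstruction idea in the abstract can be illustrated with a toy, one-sample training objective. This is only a minimal sketch, not the authors' implementation: it assumes single linear layers for the encoder and decoder, a standard normal prior, a unit-variance Gaussian likelihood, and one-hot phoneme labels. All function names, dimensions, and the `init_linear` helper are hypothetical. The point it shows is that the decoder receives the phoneme label alongside the latent variable, so the latent variable has no need to carry phoneme identity.

```python
import math
import random


def init_linear(n_out, n_in, rng, scale=0.1):
    """Hypothetical helper: small random weights and zero biases for one linear layer."""
    weights = [[rng.gauss(0.0, scale) for _ in range(n_in)] for _ in range(n_out)]
    return {"w": weights, "b": [0.0] * n_out}


def linear(layer, x):
    """Apply y = W x + b for a layer built by init_linear."""
    return [sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(layer["w"], layer["b"])]


def reparameterize(mu, log_var, rng):
    """SGVB reparameterization: z = mu + sigma * eps, with eps ~ N(0, I)."""
    return [m + math.exp(0.5 * lv) * rng.gauss(0.0, 1.0)
            for m, lv in zip(mu, log_var)]


def kl_to_standard_normal(mu, log_var):
    """Closed-form KL(N(mu, diag(exp(log_var))) || N(0, I))."""
    return 0.5 * sum(math.exp(lv) + m * m - 1.0 - lv
                     for m, lv in zip(mu, log_var))


def negative_elbo(x, phoneme_onehot, enc_mu, enc_lv, dec, rng):
    """One-sample negative ELBO for a single frame; also returns the posterior mean."""
    # Encoder (assumed linear): acoustic frame -> latent mean and log-variance.
    mu = linear(enc_mu, x)
    log_var = linear(enc_lv, x)
    z = reparameterize(mu, log_var, rng)
    # Decoder input is the latent sample concatenated with the phoneme label.
    x_hat = linear(dec, z + phoneme_onehot)
    # Unit-variance Gaussian reconstruction term, up to an additive constant.
    recon = 0.5 * sum((a - b) ** 2 for a, b in zip(x, x_hat))
    return recon + kl_to_standard_normal(mu, log_var), mu


rng = random.Random(0)
feat_dim, latent_dim, n_phonemes = 6, 2, 4  # toy sizes, chosen for illustration
enc_mu = init_linear(latent_dim, feat_dim, rng)
enc_lv = init_linear(latent_dim, feat_dim, rng)
dec = init_linear(feat_dim, latent_dim + n_phonemes, rng)

frame = [rng.gauss(0.0, 1.0) for _ in range(feat_dim)]  # stand-in acoustic frame
phoneme = [0.0, 1.0, 0.0, 0.0]                          # stand-in one-hot label
loss, encoding = negative_elbo(frame, phoneme, enc_mu, enc_lv, dec, rng)
print(f"negative ELBO: {loss:.3f}, encoding: {encoding}")
```

In a full system, a loss of this kind would be minimized over many frames with a gradient-based optimizer, and the resulting low-dimensional encoding would be supplied to the acoustic model as an auxiliary feature. Using the posterior mean as that encoding is an assumption of this sketch.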

Taxonomy

Methods: Stochastic Gradient Variational Bayes