Speech Disorder Classification Using Extended Factorized Hierarchical Variational Auto-encoders

Jinzi Qi; Hugo Van hamme

arXiv:2106.07337 · eess.AS · June 15, 2021 · Interspeech

Open Access

TL;DR

This paper proposes an extended Factorized Hierarchical Variational Auto-encoder to improve speech disorder classification by disentangling content and sequence information in disordered speech representations.

Contribution

It introduces an extended FHVAE model that better separates content and sequence features for improved disorder classification from limited data.

Findings

01

Extended FHVAE improves disentanglement of speech features.

02

Both content and sequence representations are necessary for optimal classification.

03

Aggregation at word and sentence levels enhances performance.

Abstract

Objective speech disorder classification for speakers with communication difficulties is desirable for diagnosis and for administering therapy. Given the current state of speech technology, neural networks are a natural choice for this application, but training them is hampered by a lack of labeled disordered speech data. In this research, we apply an extended version of Factorized Hierarchical Variational Auto-encoders (FHVAE) for representation learning on disordered speech. The FHVAE model extracts both content-related and sequence-related latent variables from speech data, and we use the extracted variables to explore how disorder type information is represented in them. For better classification performance, the latent variables are aggregated at the word and sentence level. We show that an extension of the FHVAE model succeeds in the better…
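
The abstract says the latent variables are aggregated at the word and sentence level but does not say how. As a rough illustration only, the sketch below assumes mean pooling: each segment-level latent vector carries a word (or sentence) identifier, and vectors sharing an identifier are averaged. The function name `aggregate_latents`, the toy vectors, and the choice of the mean are all assumptions made for this example, not details taken from the paper.

```python
from collections import defaultdict


def aggregate_latents(latents, group_ids):
    """Average latent vectors that share a group id (assumed mean pooling).

    latents:   list of equal-length lists of floats, one per speech segment
    group_ids: list of hashable ids (e.g. word or sentence index), one per segment
    Returns a dict mapping each group id to its averaged latent vector.
    """
    buckets = defaultdict(list)
    for vec, gid in zip(latents, group_ids):
        buckets[gid].append(vec)
    # zip(*vecs) iterates over latent dimensions, so each mean is per dimension.
    return {
        gid: [sum(dim) / len(vecs) for dim in zip(*vecs)]
        for gid, vecs in buckets.items()
    }


# Hypothetical toy data: three segments with 2-dimensional latents.
segment_latents = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
word_ids = [0, 0, 1]       # first two segments belong to word 0
sentence_ids = [0, 0, 0]   # all three segments belong to sentence 0

word_level = aggregate_latents(segment_latents, word_ids)
sentence_level = aggregate_latents(segment_latents, sentence_ids)
print(word_level)
print(sentence_level)
```

The same function serves both levels because only the grouping key changes; a classifier could then be trained on the pooled vectors instead of the per-segment ones.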

Taxonomy

Topics: Speech Recognition and Synthesis · Voice and Speech Disorders · Phonetics and Phonology Research