Automatic Assessment of Dysarthria Using Audio-visual Vowel Graph Attention Network

Xiaokang Liu; Xiaoxia Du; Juan Liu; Rongfeng Su; Manwa Lawrence Ng; Yumei Zhang; Yudong Yang; Shaofeng Zhao; Lan Wang; Nan Yan

arXiv:2405.03254·eess.AS·May 8, 2024

Open Access

TL;DR

This paper introduces an audio-visual vowel graph attention network that combines expert acoustic features, deep learning representations, and visual cues for the automatic assessment of dysarthria, and reports better Frenchay score regression than existing methods.

Contribution

It presents a novel neural network architecture that integrates expert knowledge, deep learning, and visual information for dysarthria assessment, with the aim of improving both interpretability and accuracy.

Findings

01. Outperforms existing methods in Frenchay score regression
02. Effectively combines acoustical features and deep learning representations
03. Incorporates visual information to improve robustness

Abstract

Automatic assessment of dysarthria remains a highly challenging task because of the high variability of acoustic signals and the limited data available. Current research on the automatic assessment of dysarthria focuses primarily on two approaches: one combines expert features with machine learning, and the other uses data-driven deep learning methods to extract representations. Research has demonstrated that expert features are effective in representing pathological characteristics, while deep learning methods excel at uncovering latent features. Integrating the advantages of both to construct a neural network architecture based on expert knowledge may therefore benefit interpretability and assessment performance. In this context, the present paper proposes a vowel graph attention network based on audio-visual information, which…

Taxonomy

Topics: Voice and Speech Disorders