Deep Neural Network for Automatic Assessment of Dysphonia

Mario Alejandro García; Ana Lorena Rosset

arXiv:2202.12957 · eess.AS · March 1, 2022 · 6 cites

Open Access

TL;DR

This paper presents a deep neural network that automatically assesses the overall severity of dysphonia on the GRBAS scale, with precision and mean absolute error close to human intra-rater performance.

Contribution

It introduces a neural network model whose design focuses on amplitude perturbations, frequency perturbations, and noise, and whose results are close to human intra-rater performance.

Findings

01. Neural network performance is close to human intra-rater performance

02. The network exceeds human inter-rater performance in dysphonia assessment

03. Effective in predicting the perceptual overall severity of dysphonia on the GRBAS scale

Abstract

The purpose of this work is to contribute to the understanding and improvement of deep neural networks in the field of vocal quality. A neural network that predicts the perceptual assessment of overall severity of dysphonia in GRBAS scale is obtained. The design focuses on amplitude perturbations, frequency perturbations, and noise. Results are compared with performance of human raters on the same data. Both the precision and the mean absolute error of the neural network are close to human intra-rater performance, exceeding inter-rater performance.
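
The comparison described in the abstract can be illustrated with a minimal sketch of the two kinds of measure it mentions. This is not the authors' evaluation code. It assumes that "precision" can be read as the fraction of exact grade matches, that grades are integers from 0 to 3 as on the GRBAS scale, and it uses made-up ratings. The function names `mean_absolute_error` and `exact_agreement` are hypothetical.

```python
def _check(predicted, reference):
    # Both rating lists must be non-empty and aligned sample by sample.
    if not predicted or len(predicted) != len(reference):
        raise ValueError("rating lists must be non-empty and of equal length")


def mean_absolute_error(predicted, reference):
    """Average absolute distance between two lists of grades."""
    _check(predicted, reference)
    return sum(abs(p - r) for p, r in zip(predicted, reference)) / len(predicted)


def exact_agreement(predicted, reference):
    """Fraction of samples where both lists give the same grade (assumed reading of 'precision')."""
    _check(predicted, reference)
    return sum(p == r for p, r in zip(predicted, reference)) / len(predicted)


# Made-up overall severity grades (0 = normal ... 3 = severe), for illustration only.
reference_rating = [0, 1, 2, 3, 1, 2]   # hypothetical reference rater
model_rating = [0, 1, 2, 2, 1, 3]       # hypothetical network output
other_rater = [1, 1, 3, 2, 0, 2]        # hypothetical second human rater

for name, grades in [("model", model_rating), ("other rater", other_rater)]:
    mae = mean_absolute_error(grades, reference_rating)
    agreement = exact_agreement(grades, reference_rating)
    print(f"{name}: MAE={mae:.3f}, agreement={agreement:.3f}")
```

Scoring the network and a human rater against the same reference puts both on the same footing, which is the kind of comparison the abstract reports. The paper's own definitions of precision and of intra-rater and inter-rater performance may differ from this sketch.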

Taxonomy

Topics: Speech and Audio Processing · Voice and Speech Disorders · Music and Audio Processing