Deep Neural Network for Automatic Assessment of Dysphonia
Mario Alejandro García, Ana Lorena Rosset

TL;DR
This paper presents a deep neural network that predicts the perceptual rating of overall dysphonia severity on the GRBAS scale. Its precision and mean absolute error are close to human intra-rater performance and exceed inter-rater performance on the same data.
Contribution
It introduces a neural network whose design focuses on amplitude perturbations, frequency perturbations, and noise for dysphonia assessment, with performance close to human intra-rater performance.
Findings
Precision and mean absolute error of the neural network are close to human intra-rater performance
The neural network exceeds human inter-rater performance on the same data
The model predicts the perceptual assessment of overall dysphonia severity on the GRBAS scale
Abstract
The purpose of this work is to contribute to the understanding and improvement of deep neural networks in the field of vocal quality. A neural network that predicts the perceptual assessment of overall severity of dysphonia on the GRBAS scale is obtained. The design focuses on amplitude perturbations, frequency perturbations, and noise. Results are compared with the performance of human raters on the same data. Both the precision and the mean absolute error of the neural network are close to human intra-rater performance, exceeding inter-rater performance.
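The comparison described in the abstract can be illustrated with a minimal sketch. It assumes ratings are integers from 0 to 3 on the GRBAS overall-severity grade and that "precision" means the fraction of exact matches between two sets of ratings; the abstract does not define the term, so that reading is an assumption. The function names and the sample ratings are hypothetical and are not taken from the paper.

```python
from typing import Sequence


def mean_absolute_error(a: Sequence[int], b: Sequence[int]) -> float:
    """Average absolute difference between two equally long rating lists."""
    if len(a) != len(b) or len(a) == 0:
        raise ValueError("rating lists must be non-empty and of equal length")
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)


def exact_agreement(a: Sequence[int], b: Sequence[int]) -> float:
    """Fraction of items on which two rating lists give the same grade."""
    if len(a) != len(b) or len(a) == 0:
        raise ValueError("rating lists must be non-empty and of equal length")
    return sum(x == y for x, y in zip(a, b)) / len(a)


# Hypothetical grades (0 = normal ... 3 = severe) for six voice samples.
rater_a = [0, 1, 2, 3, 1, 2]   # one human rater
rater_b = [0, 1, 2, 2, 1, 3]   # a second human rater, same samples
model = [0, 2, 2, 3, 1, 2]     # illustrative model predictions

# Inter-rater baseline: how far two humans are from each other.
inter_mae = mean_absolute_error(rater_a, rater_b)
inter_agreement = exact_agreement(rater_a, rater_b)

# Model against one human rater, scored with the same two measures.
model_mae = mean_absolute_error(rater_a, model)
model_agreement = exact_agreement(rater_a, model)

print(f"inter-rater: MAE={inter_mae:.3f}, agreement={inter_agreement:.3f}")
print(f"model:       MAE={model_mae:.3f}, agreement={model_agreement:.3f}")
```

An intra-rater baseline would be computed the same way, by comparing a single rater's first and repeated assessments of the same samples.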
Taxonomy
Topics: Speech and Audio Processing · Voice and Speech Disorders · Music and Audio Processing
