MFCC-based Recurrent Neural Network for Automatic Clinical Depression Recognition and Assessment from Speech
Emna Rejaibi, Ali Komaty, Fabrice Meriaudeau, Said Agrebi, and Alice Othmani

TL;DR
This paper introduces a deep recurrent neural network framework using MFCC and other features to detect and assess depression from speech, achieving high accuracy and low error rates, suitable for real-time clinical applications.
Contribution
It presents a novel deep learning approach combining MFCC-based features and transfer learning to improve depression recognition and severity prediction from speech data.
Findings
Outperforms state-of-the-art on DAIC-WOZ with 76.27% accuracy
Achieves 0.4 RMSE in depression assessment
Boosts classification accuracy by up to 20% with additional features
Abstract
Clinical depression or Major Depressive Disorder (MDD) is a common and serious medical illness. In this paper, a deep recurrent neural network-based framework is presented to detect depression and to predict its severity level from speech. Low-level and high-level audio features are extracted from audio recordings to predict the 24 scores of the Patient Health Questionnaire and the binary class of depression diagnosis. To overcome the problem of the small size of Speech Depression Recognition (SDR) datasets, expanding training labels and transferred features are considered. The proposed approach outperforms the state-of-the-art approaches on the DAIC-WOZ database with an overall accuracy of 76.27% and a root mean square error of 0.4 in assessing depression, while a root mean square error of 0.168 is achieved in predicting the depression severity levels. The proposed framework has several…
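The two metrics quoted above, overall accuracy for the binary diagnosis and root mean square error for the score prediction, can be illustrated with a minimal sketch. The function names and the toy values below are hypothetical and are not taken from the paper; the summary does not state the scale on which the reported RMSE values were computed, so the numbers here are not comparable to 0.4 or 0.168.

```python
import math


def accuracy(y_true, y_pred):
    """Fraction of predicted labels that match the reference labels."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def rmse(y_true, y_pred):
    """Root mean square error between reference and predicted scores."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Hypothetical toy data, for illustration only (not from DAIC-WOZ).
reference_labels = [0, 1, 0, 0, 1]    # 1 = depressed, 0 = not depressed
predicted_labels = [0, 1, 1, 0, 1]
reference_scores = [3.0, 12.0, 7.0]   # questionnaire-style severity scores
predicted_scores = [4.0, 10.0, 7.0]

print(accuracy(reference_labels, predicted_labels))
print(rmse(reference_scores, predicted_scores))
```

Accuracy rewards only exact label matches, while RMSE penalizes large score deviations more heavily than small ones, which is why the two are reported separately for the classification and assessment tasks.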
