Speech based Depression Severity Level Classification Using a Multi-Stage Dilated CNN-LSTM Model

Nadee Seneviratne; Carol Espy-Wilson

arXiv:2104.04195·eess.AS·April 12, 2021


TL;DR

This paper introduces a multi-stage dilated CNN-LSTM model that classifies depression severity from speech using articulatory coordination features, giving finer-grained outcomes than binary depressed/non-depressed classification.

Contribution

It formulates depression classification as a severity level task and proposes a novel multi-stage CNN-LSTM approach using articulatory coordination features for improved accuracy.

Findings

01. 27.47% improvement in session-level classification accuracy using ACFs from TVs

02. Segment-wise classifier performance is enhanced when combined with session-wise classifier

03. ACFs from TVs outperform MFCCs in depression severity classification

Abstract

Speech-based depression classification has gained immense popularity in recent years. However, most classification studies have focused on binary classification to distinguish depressed subjects from non-depressed subjects. In this paper, we formulate the depression classification task as a severity-level classification problem to provide more granularity in the classification outcomes. We use articulatory coordination features (ACFs) developed to capture the changes in neuromotor coordination that happen as a result of psychomotor slowing, a necessary feature of Major Depressive Disorder. The ACFs derived from the vocal tract variables (TVs) are used to train a dilated Convolutional Neural Network based depression classification model to obtain segment-level predictions. Then, we propose a Recurrent Neural Network based approach to obtain session-level predictions from…
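
The abstract mentions a dilated Convolutional Neural Network for the segment-level stage. As a rough illustration of what dilation does in general, the pure-Python sketch below applies a 1-D kernel with gaps between its taps and computes the receptive field of a stack of such layers. The function names, the toy signal, the kernel, and the dilation rates are all hypothetical choices made for illustration; none of them are taken from the paper, and this is not the authors' architecture.

```python
def dilated_conv1d(signal, kernel, dilation=1):
    """Valid-mode 1-D dilated convolution.

    Computed as cross-correlation (no kernel flip), which is the usual
    convention in deep-learning libraries. Hypothetical helper, not from the paper.
    """
    # Number of input samples covered by one output sample.
    span = (len(kernel) - 1) * dilation + 1
    if len(signal) < span:
        return []
    return [
        sum(w * signal[start + k * dilation] for k, w in enumerate(kernel))
        for start in range(len(signal) - span + 1)
    ]


def receptive_field(kernel_size, dilations):
    """Receptive field of stacked stride-1 dilated layers with one shared kernel size."""
    return 1 + sum((kernel_size - 1) * d for d in dilations)


# Toy input and kernel (illustrative values only).
signal = [1, 2, 3, 4, 5, 6, 7, 8]
kernel = [1, 0, -1]

# With dilation 1 the kernel taps are adjacent; with dilation 2 they skip
# every other sample, so each output covers a wider stretch of the input.
print(dilated_conv1d(signal, kernel, dilation=1))
print(dilated_conv1d(signal, kernel, dilation=2))

# Three stacked layers with kernel size 3 and assumed dilation rates 1, 2, 4.
print(receptive_field(3, [1, 2, 4]))
```

The point of the sketch is that increasing the dilation rate widens the temporal context each output sees without adding kernel weights, which is one common reason dilated convolutions are used on long feature sequences.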
