Toward Knowledge-Driven Speech-Based Models of Depression: Leveraging Spectrotemporal Variations in Speech Vowels

Kexin Feng; Theodora Chaspari

arXiv:2210.02527·cs.LG·October 7, 2022·1 cites

Open Access · 1 Repo

TL;DR

This paper presents a knowledge-driven machine learning approach that leverages spectrotemporal vowel-level speech features to improve depression detection and enhance interpretability for clinical applications.

Contribution

It introduces a novel vowel-based spectrotemporal modeling framework combined with explainability methods for depression detection from speech.

Findings

1. The proposed method outperforms baseline models that do not integrate vowel-level information.
2. Spectrotemporal information from vowel segments influences the model's decisions more than information from non-vowel segments.
3. The method provides interpretable insights into temporal changes in speech related to depression.
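
The interpretability findings come from a modified version of LIME, as the abstract notes. The paper's modification is not described on this page, so the sketch below shows only the generic LIME idea: randomly silence segments, query the model, and fit a proximity-weighted linear surrogate whose coefficients rank the segments. The function `black_box`, its per-segment weights, the kernel width, and the sample count are all hypothetical placeholders, not values from the paper.

```python
import math
import random


def black_box(mask):
    """Hypothetical stand-in for a trained model: returns a depression
    score given which segments are kept (1) or silenced (0)."""
    weights = [0.05, 0.40, 0.10, 0.25]  # made-up per-segment influence
    return 0.1 + sum(w * m for w, m in zip(weights, mask))


def solve(a, b):
    """Solve a @ x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(m[r][c]))
        m[c], m[p] = m[p], m[c]
        for r in range(n):
            if r != c:
                f = m[r][c] / m[c][c]
                m[r] = [x - f * y for x, y in zip(m[r], m[c])]
    return [m[i][n] / m[i][i] for i in range(n)]


def lime_weights(predict, n_segments, n_samples=200,
                 kernel_width=0.75, ridge=1e-6, seed=0):
    """Fit a weighted linear surrogate on random keep/silence masks."""
    rng = random.Random(seed)
    rows, ys, ws = [], [], []
    for _ in range(n_samples):
        mask = [rng.randint(0, 1) for _ in range(n_segments)]
        # Distance from the unperturbed input: fraction of silenced segments.
        dist = sum(1 - m for m in mask) / n_segments
        ws.append(math.exp(-(dist ** 2) / kernel_width ** 2))
        rows.append([1.0] + [float(m) for m in mask])  # intercept + mask
        ys.append(predict(mask))
    d = n_segments + 1
    # Normal equations of weighted ridge regression:
    # (X^T W X + ridge * I) beta = X^T W y
    xtwx = [[sum(w * r[i] * r[j] for w, r in zip(ws, rows))
             + (ridge if i == j else 0.0) for j in range(d)]
            for i in range(d)]
    xtwy = [sum(w * r[i] * y for w, r, y in zip(ws, rows, ys))
            for i in range(d)]
    return solve(xtwx, xtwy)[1:]  # drop the intercept


coefs = lime_weights(black_box, n_segments=4)
ranking = sorted(range(4), key=lambda i: -coefs[i])
print(ranking)  # segment 1 ranks first, since the toy model weights it most
```

Because the toy model is itself linear, the surrogate recovers its weights almost exactly. With a real network the coefficients are only a local approximation around the analyzed recording.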

Abstract

Psychomotor retardation associated with depression has been linked with tangible differences in vowel production. This paper investigates a knowledge-driven machine learning (ML) method that integrates spectrotemporal information of speech at the vowel level to identify depression. Low-level speech descriptors are learned by a convolutional neural network (CNN) that is trained for vowel classification. The temporal evolution of those low-level descriptors is modeled at a high level, within and across utterances, via a long short-term memory (LSTM) model that makes the final depression decision. A modified version of Local Interpretable Model-agnostic Explanations (LIME) is further used to identify the impact of low-level spectrotemporal vowel variation on the decisions and to observe the high-level temporal change of the depression likelihood. The proposed method outperforms…
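
The two-stage design in the abstract can be sketched in PyTorch, the framework listed for the linked repository. This is a minimal illustration, not the authors' architecture: the layer sizes, the five vowel classes, the 40-band by 20-frame spectrogram patches, and the class names `VowelCNN` and `DepressionLSTM` are all assumptions. It also flattens the "within and across utterances" hierarchy into a single sequence of segments per session.

```python
import torch
import torch.nn as nn


class VowelCNN(nn.Module):
    """Low-level stage: a 2D CNN over spectrogram patches with a vowel
    classification head. The embedding is reused by the LSTM stage."""

    def __init__(self, n_vowels: int = 5, embed_dim: int = 32):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 8, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.MaxPool2d(2),
            nn.Conv2d(8, 16, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),  # -> (batch, 16, 1, 1)
        )
        self.embed = nn.Linear(16, embed_dim)
        self.vowel_head = nn.Linear(embed_dim, n_vowels)

    def forward(self, x):
        # x: (batch, 1, n_bands, n_frames)
        h = self.features(x).flatten(1)
        z = torch.relu(self.embed(h))
        return z, self.vowel_head(z)  # embedding, vowel logits


class DepressionLSTM(nn.Module):
    """High-level stage: models how the CNN embeddings evolve over time
    and emits one depression logit per session."""

    def __init__(self, embed_dim: int = 32, hidden_dim: int = 16):
        super().__init__()
        self.lstm = nn.LSTM(embed_dim, hidden_dim, batch_first=True)
        self.out = nn.Linear(hidden_dim, 1)

    def forward(self, seq):
        # seq: (batch, n_segments, embed_dim)
        _, (h_n, _) = self.lstm(seq)
        return self.out(h_n[-1]).squeeze(-1)


torch.manual_seed(0)
cnn, lstm = VowelCNN(), DepressionLSTM()

# Random stand-in data: 2 sessions, 10 segments each, 40 bands x 20 frames.
segments = torch.randn(2, 10, 1, 40, 20)
b, t = segments.shape[:2]

emb, vowel_logits = cnn(segments.flatten(0, 1))  # (20, 32), (20, 5)
prob = torch.sigmoid(lstm(emb.reshape(b, t, -1)))  # (2,)
print(prob.shape)
```

In a real pipeline the CNN would first be trained on the vowel classification task with `vowel_logits`, and its embeddings would then be passed to the LSTM. Whether the CNN is frozen or fine-tuned at that point is not stated here.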

Code & Models

Repositories

hubbs-lab-tamu/2dcnn-lstm-depression-identification
PyTorch · Official

Taxonomy

Topics: Voice and Speech Disorders