Robust and Explainable Depression Identification from Speech Using Vowel-Based Ensemble Learning Approaches

Kexin Feng; Theodora Chaspari

arXiv:2410.18298 · cs.LG · October 25, 2024

TL;DR

This paper presents vowel-based ensemble learning methods for depression detection from speech, emphasizing explainability and robustness, with approaches that decompose symptoms and severity for improved clinical utility.

Contribution

It introduces novel vowel-based embeddings and ensemble strategies that enhance explainability and robustness in depression classification from speech data.

Findings

1. Performance comparable to state-of-the-art baselines

2. Enhanced robustness against dataset mean/median variations

3. Improved system explainability for clinical use

Abstract

This study investigates explainable machine learning algorithms for identifying depression from speech. Grounded in evidence from speech production that depression affects motor control and vowel generation, it uses pre-trained vowel-based embeddings that integrate semantically meaningful linguistic units. An ensemble learning approach then decomposes the problem into constituent parts characterized by specific depression symptoms and severity levels. Two methods are explored: a "bottom-up" approach with 8 models predicting individual Patient Health Questionnaire-8 (PHQ-8) item scores, and a "top-down" approach using a Mixture of Experts (MoE) with a router module for assessing depression severity. Both methods achieve performance comparable to state-of-the-art baselines, demonstrating robustness and reduced susceptibility to dataset mean/median values. System…
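
The two ensemble strategies named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the example numbers, and the screening cutoff of 10 are assumptions made for illustration, and the per-item and expert predictions are stand-in values rather than outputs of real models. The sketch only shows how eight per-item predictions could be aggregated into a PHQ-8 total (bottom-up) and how a router's softmax gates could weight expert outputs (top-down).

```python
import math

PHQ8_ITEMS = 8      # PHQ-8 has eight items, each scored 0 to 3
PHQ8_CUTOFF = 10    # assumed screening threshold, not taken from the paper


def bottom_up_total(item_scores):
    """Sum eight per-item predictions into a total, clipping each to [0, 3]."""
    if len(item_scores) != PHQ8_ITEMS:
        raise ValueError("expected one prediction per PHQ-8 item")
    return sum(min(3.0, max(0.0, s)) for s in item_scores)


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def moe_predict(router_logits, expert_outputs):
    """Weight each expert's output by the router's softmax gate."""
    gates = softmax(router_logits)
    return sum(g * y for g, y in zip(gates, expert_outputs))


# Hypothetical per-item predictions from eight separate models.
item_preds = [1.2, 0.4, 2.1, 3.5, 0.0, 1.0, 0.8, -0.2]
total = bottom_up_total(item_preds)
flagged = total >= PHQ8_CUTOFF

# Hypothetical router logits and expert severity estimates.
severity = moe_predict([0.0, 0.0], [4.0, 12.0])
```

In the bottom-up case the per-item predictions remain inspectable, which is one way such a decomposition could support explainability; in the top-down case the gate values indicate which expert dominated a given prediction.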

Taxonomy

Topics: Emotion and Mood Recognition · Speech Recognition and Synthesis · Sentiment Analysis and Opinion Mining