Predicting Individual Depression Symptoms from Acoustic Features During Speech
Sebastian Rodriguez, Sri Harsha Dumpala, Katerina Dikaios, Sheri Rempel, Rudolf Uher, Sageev Oore

TL;DR
This paper explores using acoustic speech features with neural networks to predict individual depression symptoms, aiming to improve understanding and diagnosis of depression through detailed item-level analysis.
Contribution
It introduces a method to predict individual depression items from speech using CNN and LSTM models, incorporating temporal context and voting schemes for enhanced accuracy.
Findings
Neural networks can predict individual depression symptoms from speech.
Temporal context learning improves prediction accuracy.
Voting schemes influence the reliability of depression detection.
Abstract
Current automatic depression detection systems provide predictions directly without relying on the individual symptoms/items of depression as denoted in the clinical depression rating scales. In contrast, clinicians assess each item in the depression rating scale in a clinical setting, thus implicitly providing a more detailed rationale for a depression diagnosis. In this work, we make a first step towards using the acoustic features of speech to predict individual items of the depression rating scale before obtaining the final depression prediction. For this, we use convolutional (CNN) and recurrent (long short-term memory (LSTM)) neural networks. We consider different approaches to learning the temporal context of speech. Further, we analyze two variants of voting schemes for individual item prediction and depression detection. We also include an animated visualization that shows an…
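The abstract mentions two voting schemes but this summary does not say which ones. As a rough illustration only, the sketch below contrasts two common choices, hard (majority) voting and soft (mean-probability) voting, for combining segment-level predictions of a single item into one score, and then sums item scores to reach a depression decision. The function names, the three-class item scale, and the threshold are all hypothetical and are not taken from the paper.

```python
from collections import Counter
from statistics import mean


def hard_vote(segment_probs):
    """Majority vote: each segment votes for its most probable class."""
    votes = [max(range(len(p)), key=p.__getitem__) for p in segment_probs]
    return Counter(votes).most_common(1)[0][0]


def soft_vote(segment_probs):
    """Average class probabilities over segments, then pick the top class."""
    n_classes = len(segment_probs[0])
    avg = [mean(p[c] for p in segment_probs) for c in range(n_classes)]
    return max(range(n_classes), key=avg.__getitem__)


def detect_depression(item_scores, threshold):
    """Hypothetical rule: flag depression when the summed item scores
    reach an assumed cut-off (the real scale and cut-off are not given here)."""
    return sum(item_scores) >= threshold


# Made-up per-segment class probabilities for one item (classes 0, 1, 2).
example = [
    [0.4, 0.6, 0.0],
    [0.4, 0.6, 0.0],
    [0.9, 0.1, 0.0],
]

hard_score = hard_vote(example)  # two of three segments favor class 1
soft_score = soft_vote(example)  # the confident third segment shifts the mean
```

In this toy input the two schemes disagree: hard voting follows the majority of segments, while soft voting is pulled toward the one segment with a very confident prediction. That is one way a choice of voting scheme could change the final item score and, through the sum, the detection outcome.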
Taxonomy
Topics: Emotion and Mood Recognition
