Prediction of Depression Severity Based on the Prosodic and Semantic Features with Bidirectional LSTM and Time Distributed CNN
Kaining Mao, Wei Zhang, Deborah Baofeng Wang, Ang Li, Rongqi Jiao, Yanhui Zhu, Bin Wu, Tiansheng Zheng, Lei Qian, Wei Lyu, Minjie Ye, Jie Chen

TL;DR
This paper presents an attention-based multimodal system combining speech and text features with Bi-LSTM and CNN for accurate depression severity prediction, outperforming previous methods.
Contribution
It introduces a novel multimodal depression prediction model using Bi-LSTM, T-CNN, and GloVe embeddings, demonstrating significant improvements over prior approaches.
Findings
Audio and text models achieve high F1 scores in depression severity estimation.
Multimodal fusion yields the best patient-level depression detection performance.
The proposed approach significantly outperforms previous methods.
Abstract
Depression increasingly affects individuals worldwide, both physically and psychologically. It has become a major global public health problem and attracts attention from various research fields. Traditionally, depression is diagnosed through semi-structured interviews and supplementary questionnaires, which makes the diagnosis rely heavily on physicians' experience and leaves it subject to bias. Mental health monitoring and cloud-based remote diagnosis can be implemented through an automated depression diagnosis system. In this article, we propose an attention-based multimodal speech and text representation for depression prediction. Our model is trained to estimate the depression severity of participants using the Distress Analysis Interview Corpus-Wizard of Oz (DAIC-WOZ) dataset. For the audio modality, we use the collaborative voice analysis repository (COVAREP)…
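The attention-based multimodal representation mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the dot-product scoring, the `query` context vectors, the concatenation in `fuse`, and all toy values are assumptions made for illustration. The sketch only shows the general idea of weighting per-timestep features (such as Bi-LSTM outputs) with softmax attention, pooling them into one vector per modality, and combining the audio and text vectors.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attention_pool(frames, query):
    """Collapse a sequence of feature vectors into a single vector.

    frames: list of T vectors of length D (e.g. per-timestep recurrent outputs).
    query:  hypothetical learned context vector of length D (an assumption).
    """
    # Dot-product relevance score for each timestep.
    scores = [sum(f_i * q_i for f_i, q_i in zip(f, query)) for f in frames]
    weights = softmax(scores)
    dim = len(frames[0])
    # Attention-weighted sum over timesteps, one value per feature dimension.
    pooled = [sum(w * f[d] for w, f in zip(weights, frames)) for d in range(dim)]
    return pooled, weights

def fuse(audio_vec, text_vec):
    """Feature-level fusion by concatenation (assumed, not confirmed by the source)."""
    return audio_vec + text_vec

# Toy inputs standing in for audio and text features.
audio_frames = [[0.0, 1.0], [1.0, 0.0], [1.0, 1.0]]
text_frames = [[0.5, 0.5], [0.5, 0.5]]

audio_vec, audio_w = attention_pool(audio_frames, query=[1.0, 1.0])
text_vec, text_w = attention_pool(text_frames, query=[1.0, 0.0])
fused = fuse(audio_vec, text_vec)
```

In a trained model the fused vector would feed a classifier or regressor for depression severity; here the weights simply show that timesteps scoring higher against the query contribute more to the pooled vector.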
Taxonomy
Methods: Memory Network
