Depression Severity Estimation from Multiple Modalities
Evgeny Stepanov, Stephane Lathuiliere, Shammur Absar Chowdhury, Arindam Ghosh, Radu-Laurentiu Vieriu, Nicu Sebe, Giuseppe Riccardi

TL;DR
This paper develops an automatic system that estimates depression severity (PHQ-8 scores) from multimodal data (speech, language, and facial features) for the AVEC 2017 challenge, reporting a best test-set MAE of 4.11.
Contribution
It introduces a multimodal approach for depression severity estimation and demonstrates the effectiveness of facial landmarks and turn features in predicting PHQ-8 scores.
Findings
Facial landmark features achieved the lowest MAE on the development set, 4.66.
Speech behavioral features resulted in an MAE of 4.73.
Turn features from audio transcriptions achieved the best test MAE of 4.11.
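For readers unfamiliar with the metric, the mean absolute error (MAE) reported above is the average absolute difference between predicted and true PHQ-8 scores. The sketch below is illustrative only: the function name and the sample scores are hypothetical and are not taken from the paper or its data.

```python
def mean_absolute_error(y_true, y_pred):
    """Average absolute difference between true and predicted scores."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


# Hypothetical PHQ-8 scores (the questionnaire ranges from 0 to 24);
# these values are made up for illustration, not drawn from the paper.
true_scores = [2, 10, 17, 5]
predicted_scores = [4, 7, 15, 9]

mae = mean_absolute_error(true_scores, predicted_scores)
print(f"MAE = {mae:.2f}")
```

A lower MAE means the predicted scores sit closer, on average, to the questionnaire scores.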
Abstract
Depression is a major debilitating disorder which can affect people of all ages. With a continuous increase in the number of annual cases of depression, there is a need to develop automatic techniques for the detection of the presence and extent of depression. In this AVEC challenge we explore different modalities (speech, language and visual features extracted from the face) to design and develop automatic methods for the detection of depression. In the psychology literature, the PHQ-8 questionnaire is well established as a tool for measuring the severity of depression. In this paper we aim to automatically predict the PHQ-8 scores from features extracted from the different modalities. We show that visual features extracted from facial landmarks obtain the best performance in terms of estimating the PHQ-8 results with a mean absolute error (MAE) of 4.66 on the development set. Behavioral…
