Measuring Depression Symptom Severity from Spoken Language and 3D Facial Expressions
Albert Haque, Michelle Guo, Adam S Miner, Li Fei-Fei

TL;DR
This paper introduces a multi-modal machine learning approach that combines 3D facial expressions and spoken language to measure depression symptom severity and detect major depressive disorder, which could make mental health diagnostics more accessible via smartphones.
Contribution
It presents a multi-modal method that uses 3D facial expressions and spoken language, both commonly available from modern cell phones, to measure depression symptom severity on the PHQ scale and to detect major depressive disorder.
Findings
Average error of 3.67 points (15.3% relative) on the PHQ scale
83.3% sensitivity for detecting major depressive disorder
82.6% specificity for detecting major depressive disorder
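For readers unfamiliar with these metrics, the sketch below shows how sensitivity, specificity, and a relative error are typically computed. It is an illustration only, not the authors' evaluation code: the function names and the toy labels are hypothetical, and the 24-point scale range is an assumption (the range of the PHQ-8 variant) that this summary does not state.

```python
def sensitivity_specificity(y_true, y_pred):
    """Return (sensitivity, specificity) for binary labels, where 1 = depressed."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # depressed, flagged
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # depressed, missed
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # healthy, cleared
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # healthy, flagged
    return tp / (tp + fn), tn / (tn + fp)


def relative_error(mean_abs_error, scale_range):
    """Express an absolute error as a fraction of the scale's range."""
    return mean_abs_error / scale_range


# Hypothetical toy labels, NOT data from the paper.
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 0, 0, 0, 0, 1]
sens, spec = sensitivity_specificity(y_true, y_pred)
print(f"sensitivity={sens:.3f}, specificity={spec:.3f}")

# Assumption: a 24-point scale range, under which 3.67 points is about 15.3%.
print(f"relative error: {relative_error(3.67, 24):.1%}")
```

Sensitivity is the share of truly depressed cases the model flags, and specificity is the share of non-depressed cases it correctly clears, so the two figures together describe both kinds of error.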
Abstract
With more than 300 million people depressed worldwide, depression is a global problem. Due to access barriers such as social stigma, cost, and treatment availability, 60% of mentally-ill adults do not receive any mental health services. Effective and efficient diagnosis relies on detecting clinical symptoms of depression. Automatic detection of depressive symptoms would potentially improve diagnostic accuracy and availability, leading to faster intervention. In this work, we present a machine learning method for measuring the severity of depressive symptoms. Our multi-modal method uses 3D facial expressions and spoken language, commonly available from modern cell phones. It demonstrates an average error of 3.67 points (15.3% relative) on the clinically-validated Patient Health Questionnaire (PHQ) scale. For detecting major depressive disorder, our model demonstrates 83.3% sensitivity…
Taxonomy
Topics: Mental Health via Writing · Emotion and Mood Recognition · Functional Brain Connectivity Studies
