Automatic Detection of Depression from Stratified Samples of Audio Data
Pongpak Manoret, Punnatorn Chotipurk, Sompoom Sunpaweravong, Chanati Jantrachotechatchawan, Kobchai Duangrattanalert

TL;DR
This study explores deep learning methods for detecting depression from voice recordings. A 1D CNN-GRU model achieved the best result, an F1 score of 0.75, suggesting that such models could aid mental health diagnosis, especially where access to psychiatrists is limited.
Contribution
It introduces a deep learning approach using stratified audio samples and compares different encoders for depression detection from voice data.
Findings
1D CNN-GRU achieved the best performance with an F1 score of 0.75.
Hyperparameter tuning improved model accuracy.
Voice-based depression detection shows potential as a diagnostic tool.
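For readers unfamiliar with the metric, the F1 score reported above is the harmonic mean of precision and recall. The sketch below computes it for a binary task on toy labels. The labels are invented for illustration and are not the paper's predictions, and the summary does not say which averaging scheme (binary, macro, or weighted) the authors used, so treating "depressed" as the single positive class is an assumption.

```python
def f1_score(y_true, y_pred, positive=1):
    """Binary F1: harmonic mean of precision and recall for the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        # No true positives: precision and recall are both 0 (or undefined).
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Toy example (hypothetical labels): 1 = depressed, 0 = healthy.
toy_true = [1, 1, 1, 1, 0, 0, 0, 0]
toy_pred = [1, 1, 1, 0, 1, 0, 0, 0]
toy_f1 = f1_score(toy_true, toy_pred)  # 3 TP, 1 FP, 1 FN
print(toy_f1)
```

Because F1 ignores true negatives, it is often preferred over accuracy when one class is much rarer than the other.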
Abstract
Depression is a common mental disorder that affects millions of people around the world and has become more severe with the arrival of COVID-19. Nevertheless, proper diagnosis is not accessible in many regions due to a severe shortage of psychiatrists. This scarcity is worse in low-income countries, which have a psychiatrist-to-population ratio 210 times lower than that of countries with better economies. This study aimed to explore applications of deep learning in diagnosing depression from voice samples. We collected data from the DAIC-WOZ database, which contained 189 vocal recordings from 154 individuals. Voice samples from patients with a PHQ-8 score equal to or higher than 10 were labeled as depressed, and those with a PHQ-8 score lower than 10 were considered healthy. We applied mel-spectrograms to extract relevant features from the audio. Three types of encoders were tested…
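The labeling rule described in the abstract can be sketched in a few lines. This is a minimal illustration, not the authors' code: the function name, the integer label encoding (1 = depressed, 0 = healthy), and the example scores are assumptions. The 0 to 24 range check reflects the PHQ-8 questionnaire itself (eight items, each scored 0 to 3).

```python
PHQ8_CUTOFF = 10  # threshold stated in the abstract


def label_from_phq8(score):
    """Return 1 (depressed) if the PHQ-8 score is >= 10, else 0 (healthy)."""
    if not 0 <= score <= 24:
        # PHQ-8 has eight items scored 0-3, so valid totals lie in [0, 24].
        raise ValueError(f"PHQ-8 score out of range: {score}")
    return 1 if score >= PHQ8_CUTOFF else 0


# Hypothetical scores, for illustration only.
example_scores = [3, 9, 10, 17]
example_labels = [label_from_phq8(s) for s in example_scores]
print(example_labels)  # [0, 0, 1, 1]
```

Note that a score of exactly 10 falls in the depressed class, matching the "equal to or higher than 10" wording.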
Taxonomy
Topics: Emotion and Mood Recognition · Voice and Speech Disorders · Phonocardiography and Auscultation Techniques
