Validating Computational Markers of Depressive Behavior: Cross-Linguistic Speech-Based Depression Detection with Neurophysiological Validation
Fuxiang Tao, Dongwei Li, Shuning Tang, Xuri Ge, Wei Ma, Anna Esposito, Alessandro Vinciarelli

TL;DR
This study validates a speech-based depression detection model across languages and against neurophysiological measures, demonstrating its robustness on Mandarin Chinese speech and its neurobiological relevance through correlation with EEG.
Contribution
It extends the Cross-Data Multilevel Attention framework to Chinese, showing cross-linguistic robustness and providing the first neurophysiological validation of speech-based depression detection.
Findings
State-of-the-art F1-score of 89.6% on the Chinese dataset
Emotionally valenced speech outperforms neutral speech in detection
Significant correlations between depression estimates and neural oscillatory patterns
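The two quantities named in these findings, an F1-score and a correlation between model estimates and a neural measure, can be illustrated with a minimal sketch. It is not the authors' evaluation code. The function names `f1_score` and `pearson_r` are hypothetical, the toy values are invented for illustration, and the use of Pearson's coefficient is an assumption, since the listing does not say which correlation statistic the paper uses.

```python
import math


def f1_score(y_true, y_pred):
    """F1 for the positive class (1 = depressed, 0 = control)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp == 0:
        return 0.0  # precision and recall are both zero or undefined
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def pearson_r(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    norm_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (norm_x * norm_y)


# Toy data, invented for illustration only (not from the paper).
labels = [1, 1, 0, 0, 1, 0]
predictions = [1, 0, 0, 0, 1, 1]
model_scores = [0.9, 0.4, 0.2, 0.1, 0.8, 0.6]    # hypothetical depression estimates
band_power = [1.8, 1.1, 0.7, 0.5, 1.6, 1.2]      # hypothetical EEG band power

print(f"F1 = {f1_score(labels, predictions):.3f}")
print(f"r  = {pearson_r(model_scores, band_power):.3f}")
```

F1 is the harmonic mean of precision and recall, so it penalizes a model that does well on only one of the two. A significance test on the correlation would additionally require a p-value, which this sketch does not compute.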
Abstract
Speech-based depression detection has shown promise as an objective diagnostic tool, yet the cross-linguistic robustness of acoustic markers and their neurobiological underpinnings remain underexplored. This study extends the Cross-Data Multilevel Attention (CDMA) framework, initially validated on Italian, to investigate these dimensions using a Mandarin Chinese dataset with electroencephalography (EEG) recordings. We systematically fuse read speech with spontaneous speech across different emotional valences (positive, neutral, negative) to investigate whether emotional arousal is a more critical factor than valence polarity in enhancing speech-based detection performance. Additionally, we establish the first neurophysiological validation for a speech-based depression model by correlating its predictions with neural oscillatory patterns during emotional face processing. Our results demonstrate…
