A Survey on Multilingual Mental Disorders Detection from Social Media Data
Ana-Maria Bucur, Marcos Zampieri, Tharindu Ranasinghe, Fabio Crestani

TL;DR
This survey reviews multilingual social media data for mental disorder detection, emphasizing the need for diverse language resources and cultural considerations to improve global digital mental health screening.
Contribution
It compiles 108 datasets across 25 languages and discusses cultural and resource challenges in multilingual mental health NLP research.
Findings
Most existing studies focus on English data, limiting global applicability.
Low-resource languages lack sufficient datasets and tools.
Depression is the most studied mental disorder in social media data.
Abstract
The increasing prevalence of mental disorders globally highlights the urgent need for effective digital screening methods that can be used in multilingual contexts. Most existing studies, however, focus on English data, overlooking critical mental health signals that may be present in non-English texts. To address this gap, we present a survey of the detection of mental disorders using social media data beyond the English language. We compile a comprehensive list of 108 datasets spanning 25 languages that can be used for developing NLP models for mental health screening. In addition, we discuss the cultural nuances that influence online language patterns and self-disclosure behaviors, and how these factors can impact the performance of NLP tools. Our survey highlights major challenges, including the scarcity of resources for low- and mid-resource languages and the dominance of…
Taxonomy
Topics: Mental Health via Writing · Sentiment Analysis and Opinion Mining
