Mining Mental Health Signals: A Comparative Study of Four Machine Learning Methods for Depression Detection from Social Media Posts in Sorani Kurdish
Idrees Mohammed, Hossein Hassani

TL;DR
This study develops and compares machine learning models to detect depression from social media posts in Sorani Kurdish, addressing a language gap with a new dataset and establishing a baseline for future research.
Contribution
It introduces a novel NLP approach and dataset for depression detection in Sorani Kurdish, comparing four machine learning models and identifying the most effective one.
Findings
Random Forest achieved an accuracy and F1-score of 80%.
Developed a depression-related keyword set for Sorani Kurdish.
Established a baseline for future depression detection research in Kurdish.
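The accuracy and F1-score reported above can be illustrated with a small sketch for the three annotation classes. The summary does not say how the F1-score was averaged across classes, so macro averaging is an assumption here, and the labels in `y_true` and `y_pred` are toy values made up for illustration, not data from the paper.

```python
# Minimal sketch: accuracy and macro-averaged F1 for a three-class task.
# Assumption: macro averaging; the paper's averaging scheme is not stated here.
# The class names follow the annotation scheme described in the abstract.

LABELS = ["Shows depression", "Not-show depression", "Suspicious"]


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the reference labels."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def macro_f1(y_true, y_pred, labels=LABELS):
    """Unweighted mean of the per-class F1 scores."""
    f1_scores = []
    for label in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(t == label and p == label for t, p in pairs)
        fp = sum(t != label and p == label for t, p in pairs)
        fn = sum(t == label and p != label for t, p in pairs)
        precision = tp / (tp + fp) if (tp + fp) else 0.0
        recall = tp / (tp + fn) if (tp + fn) else 0.0
        if precision + recall:
            f1_scores.append(2 * precision * recall / (precision + recall))
        else:
            f1_scores.append(0.0)
    return sum(f1_scores) / len(labels)


# Toy labels (hypothetical), two posts per class.
S, N, U = LABELS
y_true = [S, S, N, N, U, U]
y_pred = [S, N, N, N, U, S]

acc = accuracy(y_true, y_pred)
f1 = macro_f1(y_true, y_pred)
print(f"accuracy={acc:.3f} macro_f1={f1:.3f}")
```

Macro averaging weights each class equally, which matters when a class such as Suspicious is rare; a weighted average would instead favor the majority class.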
Abstract
Depression is a common mental health condition that can lead to hopelessness, loss of interest, self-harm, and even suicide. Early detection is challenging because many individuals neither self-report nor seek timely clinical help. With the rise of social media, users increasingly express emotions online, offering new opportunities for detection through text analysis. While prior research has focused on languages such as English, no studies exist for Sorani Kurdish. This work presents a machine learning and Natural Language Processing (NLP) approach to detect depression in Sorani tweets. A set of depression-related keywords was developed with expert input and used to collect 960 public tweets from X (formerly Twitter). The dataset was annotated into three classes (Shows depression, Not-show depression, and Suspicious) by academics and final-year medical students at the University of Kurdistan…
