Weakly-Supervised Methods for Suicide Risk Assessment: Role of Related Domains
Chenghao Yang, Yudong Zhang, Smaranda Muresan

TL;DR
This paper explores weakly-supervised machine learning methods for assessing suicide risk from Reddit posts, showing that leveraging related mental health domains improves model performance despite limited labeled data.
Contribution
It empirically investigates several classes of weakly-supervised approaches and shows that pseudo-labeling based on related mental health issues (e.g., anxiety, depression) improves suicide risk assessment models.
Findings
Pseudo-labeling from related domains improves model performance.
Weakly-supervised methods outperform fully supervised baselines.
Leveraging related mental health data mitigates limited labeled data issues.
Abstract
Social media has become a valuable resource for the study of suicidal ideation and the assessment of suicide risk. Among social media platforms, Reddit has emerged as the most promising one due to its anonymity and its focus on topic-based communities (subreddits), such as r/SuicideWatch, r/Anxiety, and r/depression, that can be indicative of someone's state of mind or interest regarding mental health disorders. A challenge for previous work on suicide risk assessment has been the small amount of labeled data. We propose an empirical investigation into several classes of weakly-supervised approaches, and show that using pseudo-labeling based on related issues around mental health (e.g., anxiety, depression) helps improve model performance for suicide risk assessment.
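The abstract does not spell out the pseudo-labeling procedure, so the following is only a minimal sketch of the general idea under stated assumptions, not the authors' method. It assumes a small labeled set, an unlabeled pool (standing in for posts from related subreddits), a toy Naive Bayes text classifier, and a hypothetical confidence threshold of 0.8. The function names, placeholder tokens, and labels are all illustrative.

```python
import math
from collections import Counter


def train(examples):
    """Fit per-class word counts (a tiny multinomial Naive Bayes)."""
    counts, priors, vocab = {}, Counter(), set()
    for text, label in examples:
        words = text.lower().split()
        counts.setdefault(label, Counter()).update(words)
        priors[label] += 1
        vocab.update(words)
    return counts, priors, vocab


def predict_proba(model, text):
    """Return a dict mapping each label to its normalized probability."""
    counts, priors, vocab = model
    total = sum(priors.values())
    log_scores = {}
    for label, word_counts in counts.items():
        n = sum(word_counts.values())
        score = math.log(priors[label] / total)
        for word in text.lower().split():
            # Add-one smoothing, with one extra slot for unseen words.
            score += math.log((word_counts[word] + 1) / (n + len(vocab) + 1))
        log_scores[label] = score
    top = max(log_scores.values())
    exp_scores = {k: math.exp(v - top) for k, v in log_scores.items()}
    z = sum(exp_scores.values())
    return {k: v / z for k, v in exp_scores.items()}


def pseudo_label(labeled, unlabeled, threshold=0.8):
    """One round of pseudo-labeling: keep confident predictions, then retrain."""
    model = train(labeled)
    accepted = []
    for text in unlabeled:
        probs = predict_proba(model, text)
        label, confidence = max(probs.items(), key=lambda kv: kv[1])
        if confidence >= threshold:  # threshold is an assumed hyperparameter
            accepted.append((text, label))
    return train(labeled + accepted), accepted


# Placeholder tokens stand in for real post text (assumption for illustration).
labeled = [("alpha alpha beta", "high"), ("gamma delta delta", "low")]
unlabeled = ["alpha beta alpha", "delta gamma delta", "alpha delta"]
model, accepted = pseudo_label(labeled, unlabeled)
```

In this toy run, the ambiguous third item falls below the threshold and is left unlabeled, which is the usual reason for thresholding: only confident pseudo-labels are added to the training set.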
Taxonomy
Topics: Mental Health via Writing · Suicide and Self-Harm Studies · Topic Modeling
