Suicide Risk Assessment on Social Media with Semi-Supervised Learning
Max Lovitt, Haotian Ma, Song Wang, and Yifan Peng

TL;DR
This paper introduces a semi-supervised learning framework using pseudo-labeling and manual verification to improve suicide risk assessment from social media posts, addressing data scarcity and class imbalance issues.
Contribution
It develops a novel pseudo-label acquisition process and demonstrates that leveraging unlabeled data enhances model performance in suicide risk assessment.
Findings
Improved accuracy in suicide risk classification.
Effective handling of imbalanced datasets.
RoBERTa outperforms other models in this task.
Abstract
With social media communities increasingly becoming places where suicidal individuals post and congregate, natural language processing presents an exciting avenue for the development of automated suicide risk assessment systems. However, past efforts suffer from a lack of labeled data and class imbalances within the available labeled data. To accommodate this task's imperfect data landscape, we propose a semi-supervised framework that leverages labeled (n=500) and unlabeled (n=1,500) data and expands upon the self-training algorithm with a novel pseudo-label acquisition process designed to handle imbalanced datasets. To further ensure pseudo-label quality, we manually verify a subset of the pseudo-labeled data that was not predicted unanimously across multiple trials of pseudo-label generation. We test various models to serve as the backbone for this framework, ultimately deciding that…
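The verification step described in the abstract (flagging pseudo-labels that were not predicted unanimously across multiple trials) can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `split_by_agreement`, the input layout, and the label strings are hypothetical and chosen only for illustration, and the abstract does not specify how the trials are produced or how the imbalance-aware acquisition works.

```python
from collections import Counter


def split_by_agreement(trial_predictions):
    """Separate unanimous pseudo-labels from those needing manual review.

    trial_predictions[t][i] is the label predicted for unlabeled post i
    in trial t (hypothetical layout, assumed for this sketch).
    """
    n_trials = len(trial_predictions)
    n_posts = len(trial_predictions[0])
    unanimous = {}      # post index -> agreed label
    needs_review = {}   # post index -> vote counts per label
    for i in range(n_posts):
        votes = Counter(trial[i] for trial in trial_predictions)
        label, count = votes.most_common(1)[0]
        if count == n_trials:
            unanimous[i] = label
        else:
            needs_review[i] = dict(votes)
    return unanimous, needs_review


# Three hypothetical trials over three unlabeled posts.
trials = [
    ["low", "high", "moderate"],
    ["low", "high", "low"],
    ["low", "high", "moderate"],
]
unanimous, needs_review = split_by_agreement(trials)
```

In this toy input, posts 0 and 1 receive the same label in every trial and would be accepted as pseudo-labels, while post 2 would be routed to a human annotator.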
Taxonomy
Topics: Mental Health via Writing · Computational and Text Analysis Methods
Methods: Attention Is All You Need · Linear Layer · Multi-Head Attention · Attention Dropout · Dense Connections · Linear Warmup With Linear Decay · Layer Normalization · Dropout · WordPiece
