Robin: A Novel Online Suicidal Text Corpus of Substantial Breadth and Scale
Daniel DiPietro, Vivek Hazari, Soroush Vosoughi

TL;DR
Robin is the largest non-keyword-generated corpus of online suicidal text to date, with over 1.1 million forum posts. It covers diverse categories of suicidal language, and models trained on it reach state-of-the-art results in detecting suicidal ideation.
Contribution
The paper introduces Robin, a large-scale, diverse suicidal text corpus, and demonstrates its effectiveness in enhancing machine learning models for suicidal ideation detection.
Findings
State-of-the-art classification performance achieved (F1=0.92 with BERT)
Robin dataset includes diverse categories of suicidal language
Large-scale dataset enables better model training and understanding
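For readers unfamiliar with the F1 metric quoted above, the following is a minimal sketch of how an F1 score is computed from classification counts. The function name and the counts are hypothetical and chosen only for illustration; they are not taken from the paper's experiments.

```python
def f1_score(tp: int, fp: int, fn: int) -> float:
    """Return F1, the harmonic mean of precision and recall."""
    # Precision: fraction of predicted positives that are truly positive.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    # Recall: fraction of true positives that the classifier found.
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Hypothetical counts (NOT from the paper): 92 true positives,
# 8 false positives, 8 false negatives.
example_f1 = f1_score(tp=92, fp=8, fn=8)
```

With these illustrative counts, precision and recall are both 0.92, so the F1 score is also approximately 0.92.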
Abstract
Suicide is a major public health crisis. With more than 20,000,000 suicide attempts each year, the early detection of suicidal intent has the potential to save hundreds of thousands of lives. Traditional mental health screening methods are time-consuming, costly, and often inaccessible to disadvantaged populations; online detection of suicidal intent using machine learning offers a viable alternative. Here we present Robin, the largest non-keyword generated suicidal corpus to date, consisting of over 1.1 million online forum postings. In addition to its unprecedented size, Robin is specially constructed to include various categories of suicidal text, such as suicide bereavement and flippant references, better enabling models trained on Robin to learn the subtle nuances of text expressing suicidal ideation. Experimental results achieve state-of-the-art performance for the classification…
Taxonomy
Topics: Mental Health via Writing · Suicide and Self-Harm Studies · Grief, Bereavement, and Mental Health
Methods: Multi-Head Attention · Attention Is All You Need · Linear Layer · WordPiece · Adam · Softmax · Dropout · Dense Connections · Residual Connection · Weight Decay
