Rethinking Suicidal Ideation Detection: A Trustworthy Annotation Framework and Cross-Lingual Model Evaluation
Amina Dzafic, Merve Kavut, Ulya Bayram

TL;DR
This paper introduces a trustworthy annotation framework and evaluates cross-lingual models for suicidal ideation detection, highlighting challenges in annotation reliability, language coverage, and model performance in mental health NLP.
Contribution
It presents a novel Turkish suicidal ideation dataset, a resource-efficient annotation method, and a comprehensive evaluation of model transferability and reliability across languages.
Findings
High annotation inconsistency in existing datasets
Transformers show limited zero-shot transfer performance
Need for more rigorous, transparent annotation and evaluation practices
Abstract
Suicidal ideation detection is critical for real-time suicide prevention, yet its progress faces two under-explored challenges: limited language coverage and unreliable annotation practices. Most available datasets are in English, but even among these, high-quality, human-annotated data remains scarce. As a result, many studies rely on available pre-labeled datasets without examining their annotation process or label reliability. The lack of datasets in other languages further limits the global realization of suicide prevention via artificial intelligence (AI). In this study, we address one of these gaps by constructing a novel Turkish suicidal ideation corpus derived from social media posts and introducing a resource-efficient annotation framework involving three human annotators and two large language models (LLMs). We then address the remaining gaps by performing a bidirectional…
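To make the notion of label reliability concrete, the sketch below computes Fleiss' kappa and a majority-vote label for a pool of five annotators (for example, three humans and two LLMs). This is an illustration only: the summary above does not state which agreement statistic or aggregation rule the authors use, and the function names `fleiss_kappa` and `majority_label`, the binary label scheme, and the toy ratings are all assumptions made for this example.

```python
from collections import Counter


def fleiss_kappa(ratings, categories):
    """Fleiss' kappa for a list of items, each rated by the same number of raters.

    ratings: list of lists; ratings[i] holds the labels all raters gave item i.
    categories: the label values that may appear (hypothetical: 0 / 1).
    """
    n_items = len(ratings)
    n_raters = len(ratings[0])
    # counts[i][c] = number of raters who assigned category c to item i
    counts = [Counter(item) for item in ratings]
    # Observed agreement for each item, then averaged over items
    p_items = [
        (sum(c[cat] ** 2 for cat in categories) - n_raters)
        / (n_raters * (n_raters - 1))
        for c in counts
    ]
    p_bar = sum(p_items) / n_items
    # Expected chance agreement from the overall category proportions
    p_cat = [sum(c[cat] for c in counts) / (n_items * n_raters) for cat in categories]
    p_e = sum(p ** 2 for p in p_cat)
    if p_e == 1.0:
        # Every rating falls in one category; kappa is undefined, report 1.0
        return 1.0
    return (p_bar - p_e) / (1 - p_e)


def majority_label(item):
    """Most frequent label for one item (no ties with an odd number of raters)."""
    return Counter(item).most_common(1)[0][0]


# Toy data (made up): 4 posts, 5 annotators each, 1 = ideation, 0 = none
toy = [
    [1, 1, 1, 1, 1],
    [0, 0, 0, 0, 0],
    [1, 1, 0, 1, 0],
    [0, 0, 0, 1, 0],
]
kappa = fleiss_kappa(toy, categories=[0, 1])
labels = [majority_label(item) for item in toy]
```

A low kappa on such a matrix would flag the kind of annotation inconsistency the findings describe; a real study might instead use a library implementation or a different statistic such as Krippendorff's alpha.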
