Cordyceps@LT-EDI: Depression Detection with Reddit and Self-training
Dean Ninalga

TL;DR
This paper presents a semi-supervised learning framework for detecting depression severity from Reddit posts. It uses model-generated labels on unlabelled posts to train a stronger classifier, and it ranked 3rd in the LT-EDI@RANLP 2023 shared task.
Contribution
It introduces a novel semi-supervised approach that uses model-generated labels on unlabeled social media data to enhance depression detection.
Findings
Ranked 3rd in the Detecting Signs of Depression from Social Media Text shared task at LT-EDI@RANLP 2023.
Effective use of unlabeled data improves classification.
Demonstrated feasibility of large-scale depression detection.
Abstract
Depression is debilitating and not uncommon. Indeed, studies of excessive social media users show correlations with depression, ADHD, and other mental health concerns. Because a large number of people use social media excessively, there is a significant population of potentially undiagnosed users, along with the posts they create. In this paper, we propose a depression severity detection system that uses a semi-supervised learning technique to predict whether a post is from a user who is experiencing severe, moderate, or low (non-diagnostic) levels of depression. Namely, we use a trained model to classify a large number of unlabelled social media posts from Reddit, then use these generated labels to train a more powerful classifier. We demonstrate our framework on the Detecting Signs of Depression from Social Media Text - LT-EDI@RANLP 2023 shared task, where our framework ranks 3rd…
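The self-training loop described in the abstract (train a model, label unlabelled posts with it, then retrain on the combined data) can be sketched roughly as follows. This is an illustration only, not the paper's implementation: the tiny Naive Bayes classifier, the function names `train_nb`, `predict_proba`, and `self_train`, the toy posts, and the confidence threshold are all assumptions introduced here. The abstract also describes the second classifier as more powerful than the first, whereas this sketch reuses the same model type for brevity.

```python
import math
from collections import Counter, defaultdict


def train_nb(texts, labels):
    """Fit a multinomial Naive Bayes model on whitespace-tokenised text."""
    word_counts = defaultdict(Counter)  # label -> token counts
    class_counts = Counter(labels)      # label -> number of posts
    vocab = set()
    for text, label in zip(texts, labels):
        tokens = text.lower().split()
        word_counts[label].update(tokens)
        vocab.update(tokens)
    return {"word_counts": word_counts, "class_counts": class_counts, "vocab": vocab}


def predict_proba(model, text):
    """Return a dict of label -> probability, using add-one smoothing."""
    tokens = [t for t in text.lower().split() if t in model["vocab"]]
    total_posts = sum(model["class_counts"].values())
    vocab_size = len(model["vocab"])
    log_scores = {}
    for label, n_posts in model["class_counts"].items():
        counts = model["word_counts"][label]
        denom = sum(counts.values()) + vocab_size
        score = math.log(n_posts / total_posts)
        for token in tokens:
            score += math.log((counts[token] + 1) / denom)
        log_scores[label] = score
    # Normalise in a numerically stable way.
    top = max(log_scores.values())
    exp_scores = {k: math.exp(v - top) for k, v in log_scores.items()}
    norm = sum(exp_scores.values())
    return {k: v / norm for k, v in exp_scores.items()}


def self_train(labeled, unlabeled, threshold=0.6):
    """One round of self-training.

    The threshold is an assumption of this sketch; the abstract does not
    say whether low-confidence pseudo-labels are filtered out.
    """
    texts, labels = zip(*labeled)
    teacher = train_nb(texts, labels)

    # Label the unlabelled posts with the teacher model.
    pseudo = []
    for post in unlabeled:
        probs = predict_proba(teacher, post)
        label = max(probs, key=probs.get)
        if probs[label] >= threshold:
            pseudo.append((post, label))

    # Retrain on gold labels plus pseudo-labels.
    all_texts, all_labels = zip(*(list(labeled) + pseudo))
    student = train_nb(all_texts, all_labels)
    return student, pseudo


if __name__ == "__main__":
    # Toy, made-up posts; the three labels mirror the severity levels in the abstract.
    labeled = [
        ("i feel hopeless and empty every day", "severe"),
        ("sometimes i feel sad and tired", "moderate"),
        ("had a great day hiking with friends", "low"),
    ]
    unlabeled = [
        "hopeless and empty again",
        "great hiking trip with friends",
        "the bus was late",
    ]
    student, pseudo = self_train(labeled, unlabeled)
    for post, label in pseudo:
        print(label, "<-", post)
```

In this toy run the third unlabelled post shares no vocabulary with the labelled data, so the teacher is no better than chance on it and the threshold keeps it out of the student's training set.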
Taxonomy
Topics: Digital Mental Health Interventions
