Dreaddit: A Reddit Dataset for Stress Analysis in Social Media

Elsbeth Turcan; Kathleen McKeown

arXiv:1911.00133·cs.CL·November 4, 2019

TL;DR

Dreaddit is a Reddit text corpus for stress identification: 190K posts from five categories of Reddit communities, plus 3.5K segments taken from 3K posts and labeled using Amazon Mechanical Turk.

Contribution

The paper introduces Dreaddit, a multi-domain corpus of lengthy Reddit posts annotated for stress, extending computational stress research beyond speech and short genres such as Twitter.

Findings

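01

Preliminary supervised models for identifying stress, both neural and traditional, are presented.

02

The complexity and diversity of the data, and the characteristics of the posts, are analyzed for each of the five categories.

03

The corpus covers lengthy, multi-domain social media text, in contrast to earlier work on speech or short genres such as Twitter.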

Abstract

Stress is a nigh-universal human experience, particularly in the online world. While stress can be a motivator, too much stress is associated with many negative health outcomes, making its identification useful across a range of domains. However, existing computational research typically only studies stress in domains such as speech, or in short genres such as Twitter. We present Dreaddit, a new text corpus of lengthy multi-domain social media data for the identification of stress. Our dataset consists of 190K posts from five different categories of Reddit communities; we additionally label 3.5K total segments taken from 3K posts using Amazon Mechanical Turk. We present preliminary supervised learning methods for identifying stress, both neural and traditional, and analyze the complexity and diversity of the data and characteristics of each category.
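
The abstract does not say which traditional model is used. As a minimal sketch of what a traditional supervised baseline for binary stress identification could look like, the example below trains a multinomial Naive Bayes classifier on bag-of-words counts. Everything in it is an assumption made for illustration: the function names, the choice of Naive Bayes, and the toy segments, which are invented and are not taken from Dreaddit.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and keep runs of letters and apostrophes."""
    return re.findall(r"[a-z']+", text.lower())


def train_naive_bayes(examples):
    """Fit a multinomial Naive Bayes model from (text, label) pairs."""
    label_counts = Counter()   # number of segments per label
    word_counts = {}           # label -> Counter of token frequencies
    vocab = set()
    for text, label in examples:
        label_counts[label] += 1
        counts = word_counts.setdefault(label, Counter())
        for tok in tokenize(text):
            counts[tok] += 1
            vocab.add(tok)
    return label_counts, word_counts, vocab


def predict(model, text):
    """Return the label with the highest log-posterior for the text."""
    label_counts, word_counts, vocab = model
    total = sum(label_counts.values())
    best_label, best_score = None, -math.inf
    for label in sorted(label_counts):
        score = math.log(label_counts[label] / total)  # log prior
        counts = word_counts[label]
        denom = sum(counts.values()) + len(vocab)      # add-one smoothing
        for tok in tokenize(text):
            if tok in vocab:                           # skip unseen tokens
                score += math.log((counts[tok] + 1) / denom)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Invented toy segments (NOT Dreaddit data): 1 = stressed, 0 = not stressed.
TOY_SEGMENTS = [
    ("i can't sleep and i am so anxious about rent", 1),
    ("my boss keeps yelling and i feel overwhelmed", 1),
    ("i am scared and panicking about the exam", 1),
    ("we had a nice walk in the park today", 0),
    ("thanks everyone the recipe turned out great", 0),
    ("looking for a good book to read this weekend", 0),
]

model = train_naive_bayes(TOY_SEGMENTS)
print(predict(model, "i feel anxious and overwhelmed"))
print(predict(model, "a great weekend walk"))
```

On real data, the toy list would be replaced by the labeled segments and the model evaluated on a held-out split; a six-example training set says nothing about how well the approach performs.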

Code & Models

Repositories

gillian850413/Insight_Stress_Analysis (TensorFlow)

Taxonomy

Topics: Mental Health via Writing · Digital Mental Health Interventions · Sentiment Analysis and Opinion Mining