Where Do People Tell Stories Online? Story Detection Across Online Communities
Maria Antoniak, Joel Mire, Maarten Sap, Elliott Ash, Andrew Piper

TL;DR
This paper introduces the StorySeeker toolkit (a dataset, a codebook, and models) for detecting stories in online communities, specifically on Reddit, and uses it to show how storytelling varies and functions across communities.
Contribution
The paper provides a new annotated dataset of 502 Reddit posts and comments, a detailed codebook adapted to the social media context, and models for story detection at the document and span levels, advancing research on online storytelling.
Findings
Storytelling spans have distinctive textual features.
Storytelling distribution varies across communities.
Models effectively predict storytelling in online posts.
Abstract
Story detection in online communities is a challenging task as stories are scattered across communities and interwoven with non-storytelling spans within a single text. We address this challenge by building and releasing the StorySeeker toolkit, including a richly annotated dataset of 502 Reddit posts and comments, a detailed codebook adapted to the social media context, and models to predict storytelling at the document and span levels. Our dataset is sampled from hundreds of popular English-language Reddit communities ranging across 33 topic categories, and it contains fine-grained expert annotations, including binary story labels, story spans, and event spans. We evaluate a range of detection methods using our data, and we identify the distinctive textual features of online storytelling, focusing on storytelling spans. We illuminate distributional characteristics of storytelling on a…
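The abstract describes annotations at two levels: a binary story label per text, plus story spans and event spans inside it. The following is a minimal sketch of how such a record could be represented and summarized. The class name, field names, and the use of character offsets are all assumptions made for illustration, not the released StorySeeker format, and deriving the document-level label from the presence of a span is a simplification.

```python
from dataclasses import dataclass, field


@dataclass
class AnnotatedText:
    """Hypothetical record layout; NOT the actual StorySeeker schema."""
    text: str
    # Spans are assumed to be [start, end) character offsets into `text`.
    story_spans: list[tuple[int, int]] = field(default_factory=list)
    event_spans: list[tuple[int, int]] = field(default_factory=list)

    @property
    def contains_story(self) -> bool:
        # Simplifying assumption: a text counts as a story
        # if it has at least one annotated story span.
        return len(self.story_spans) > 0


def story_coverage(doc: AnnotatedText) -> float:
    """Fraction of characters covered by at least one story span."""
    if not doc.text:
        return 0.0
    covered = 0
    last_end = 0
    for start, end in sorted(doc.story_spans):
        # Clip to the text and skip any part already counted,
        # so overlapping spans are not double-counted.
        start = max(start, last_end)
        end = min(end, len(doc.text))
        if end > start:
            covered += end - start
            last_end = end
    return covered / len(doc.text)


# Usage: an opinion sentence followed by a short personal story.
text = "I agree with you. Last year I missed my flight and slept at the airport."
story_start = text.index("Last")
doc = AnnotatedText(text, story_spans=[(story_start, len(text))])
print(doc.contains_story, round(story_coverage(doc), 2))
```

Keeping spans as offsets rather than extracted strings lets the same record support both document-level and span-level evaluation, since storytelling and non-storytelling text are interwoven within a single post.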
Taxonomy
Topics: Digital Storytelling and Education · Video Analysis and Summarization
