A Decade of News Forum Interactions: Threaded Conversations, Signed Votes, and Topical Tags
Emma Fraxanet, Vicenç Gómez, Andreas Kaltenbrunner, Max Pellert

TL;DR
This paper introduces an anonymized, ten-year dataset of online discussions from DerStandard, a major Austrian newspaper, supporting research on discourse dynamics, user interactions, and topical analysis in German-language media.
Contribution
It provides a large-scale, longitudinal, anonymized dataset with detailed interaction metadata and semantic representations, facilitating diverse computational social science research.
Findings
Rich insights into online discussion patterns
Effective anonymization preserving data utility
Support for multilingual and semantic analyses
Abstract
We present a large-scale, longitudinal dataset capturing user activity on the online platform of DerStandard, a major Austrian newspaper. The dataset spans ten years (2013-2022) and includes over 75 million user comments, more than 400 million votes, and detailed metadata on articles and user interactions. It provides structured conversation threads, explicit up- and downvotes of user comments and editorial topic labels, enabling rich analyses of online discourse while preserving user privacy. To ensure this privacy, all persistent identifiers are anonymized using salted hash functions, and the raw comment texts are not publicly shared. Instead, we release pre-computed vector representations derived from a state-of-the-art embedding model. The dataset supports research on discussion dynamics, network structures, and semantic analyses in the mid-resourced language German, offering a…
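As an illustration of the anonymization step described in the abstract, the following is a minimal sketch of salted hashing applied to a persistent identifier. The abstract does not state which hash function, salt length, or identifier format the authors use, so SHA-256, the 32-byte salt, and the names `make_salt`, `pseudonymize`, and `user_12345` are all assumptions made for this example.

```python
import hashlib
import secrets


def make_salt(n_bytes: int = 32) -> bytes:
    """Generate a random secret salt (length is an assumption)."""
    return secrets.token_bytes(n_bytes)


def pseudonymize(identifier: str, salt: bytes) -> str:
    """Return a hex digest of salt + identifier.

    SHA-256 is assumed here; the source does not name the hash function.
    """
    return hashlib.sha256(salt + identifier.encode("utf-8")).hexdigest()


# Hypothetical identifiers, for illustration only.
salt = make_salt()
same_a = pseudonymize("user_12345", salt)
same_b = pseudonymize("user_12345", salt)
other = pseudonymize("user_67890", salt)
resalted = pseudonymize("user_12345", make_salt())
```

With the salt held fixed, the same identifier always maps to the same pseudonym, so votes and comments by one user remain linkable across the dataset. Without the salt, the mapping cannot be rebuilt by hashing candidate identifiers.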
Taxonomy
Topics: Computational and Text Analysis Methods · Authorship Attribution and Profiling · Complex Network Analysis Techniques
