Uncovering the Internet's Hidden Values: An Empirical Study of Desirable Behavior Using Highly-Upvoted Content on Reddit
Agam Goyal, Charlotte Lambert, Yoshee Jain, Eshwar Chandrasekharan

TL;DR
This study analyzes highly-upvoted Reddit comments from two years (2016 and 2022) to extract community values, revealing the limitations of existing prosociality models and the need for more nuanced measures of desirability.
Contribution
The paper introduces a large-scale, LLM-based approach for identifying community values from highly-upvoted content, surfacing new values and assessing how adequately existing prosociality models capture them.
Findings
Existing prosociality models captured only 18% of the values extracted from the comments.
The approach identified 64 values in 2016 and 72 values in 2022, spanning macro, meso, and micro levels.
The method uncovered new community values beyond prosocial measures.
Abstract
A major task for moderators of online spaces is norm-setting, essentially creating shared norms for user behavior in their communities. Platform design principles emphasize the importance of highlighting norm-adhering examples and explicitly stating community norms. However, norms and values vary between communities and go beyond content-level attributes, making it challenging for platforms and researchers to provide automated ways to identify desirable behavior to be highlighted. Current automated approaches to detect desirability are limited to measures of prosocial behavior, but we do not know whether these measures fully capture the spectrum of what communities value. In this paper, we use upvotes, which express community approval, as a proxy for desirability and examine 16,000 highly-upvoted comments across 80 popular sub-communities on Reddit. Using a large language model, we…
