ModSandbox: Facilitating Online Community Moderation Through Error Prediction and Improvement of Automated Rules
Jean Y. Song, Sangwook Lee, Jisoo Lee, Mina Kim, Juho Kim

TL;DR
ModSandbox is a virtual sandbox system that helps online community moderators find and fix errors in automated moderation rules, with the aim of reducing false positives and false negatives.
Contribution
The paper introduces ModSandbox, a novel system that visualizes and predicts errors in automated moderation rules to assist human moderators in improving rule accuracy.
Findings
ModSandbox supports quick identification of false positives and negatives.
Moderators found ModSandbox helpful in updating rules to reduce errors.
A user study with online content moderators indicated more efficient moderation with ModSandbox.
Abstract
Despite the common use of rule-based tools for online content moderation, human moderators still spend a lot of time monitoring them to ensure that they work as intended. Based on surveys and interviews with Reddit moderators who use AutoModerator, we identified the main challenges in reducing false positives and false negatives of automated rules: not being able to estimate the actual effect of a rule in advance and having difficulty figuring out how the rules should be updated. To address these issues, we built ModSandbox, a novel virtual sandbox system that detects possible false positives and false negatives of a rule to be improved and visualizes which part of the rule is causing issues. We conducted a user study with online content moderators, finding that ModSandbox can support quickly finding possible false positives and false negatives of automated rules and guide moderators to…
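To make the terms concrete, the following is a minimal sketch of what checking one keyword rule for false positives and false negatives could look like. It is not ModSandbox's implementation and does not use AutoModerator's rule syntax. It assumes a small set of sample posts that a moderator has already labeled by hand. The names `Post`, `should_remove`, `rule_matches`, and `evaluate_rule` are hypothetical and chosen only for illustration.

```python
import re
from dataclasses import dataclass


@dataclass
class Post:
    text: str
    should_remove: bool  # hypothetical hand label: should a moderator remove this post?


def rule_matches(pattern: str, post: Post) -> bool:
    """Return True if the keyword/regex rule would act on the post."""
    return re.search(pattern, post.text, flags=re.IGNORECASE) is not None


def evaluate_rule(pattern: str, posts: list[Post]) -> dict[str, list[Post]]:
    """Collect the posts on which the rule disagrees with the hand labels."""
    # False positive: the rule fires on a post that should stay up.
    false_positives = [p for p in posts if rule_matches(pattern, p) and not p.should_remove]
    # False negative: the rule misses a post that should be removed.
    false_negatives = [p for p in posts if not rule_matches(pattern, p) and p.should_remove]
    return {"false_positives": false_positives, "false_negatives": false_negatives}


# Illustrative, made-up sample posts (not data from the paper).
sample_posts = [
    Post("Buy cheap followers now!!!", should_remove=True),
    Post("Where can I buy a good keyboard?", should_remove=False),
    Post("Get free followers at my site", should_remove=True),
    Post("Weekly discussion thread", should_remove=False),
]

errors = evaluate_rule(r"\bbuy\b", sample_posts)
for kind, posts in errors.items():
    print(kind, [p.text for p in posts])
```

In this toy setup the rule `\bbuy\b` flags an innocent shopping question (a false positive) and misses a spam post that avoids the keyword (a false negative), which is the kind of mismatch a moderator would want to see before deploying a rule.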
Taxonomy
Topics: Hate Speech and Cyberbullying Detection · Advanced Malware Detection Techniques · Software Engineering Research
