Improving Moderation of Online Discussions via Interpretable Neural Models

Andrej Švec, Matúš Pikuliak, Marián Šimko, Mária Bieliková (Slovak University of Technology in Bratislava, Bratislava, Slovakia)

arXiv:1809.06906 · cs.CL · September 20, 2018

TL;DR

This paper introduces an interpretable neural network approach to assist online discussion moderation by automatically detecting and highlighting inappropriate comments, thereby easing the burden on human moderators.

Contribution

It presents a novel two-step neural model that detects and highlights inappropriate comments, improving moderation efficiency and interpretability.

Findings

01 Effective detection of inappropriate comments on a Slovak news discussion platform

02 Highlights problematic parts within comments for faster moderation

03 Demonstrates potential to assist human moderators in online discussions

Abstract

The growing number of comments makes online discussions difficult to moderate by human moderators alone. Antisocial behavior is common and often discourages other users from participating in a discussion. We propose a neural network based method that partially automates the moderation process. It consists of two steps. First, we detect inappropriate comments for moderators to review. Second, we highlight the inappropriate parts within these comments to make moderation faster. We evaluated our method on data from a major Slovak news discussion platform.
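
The two steps described in the abstract (detect, then highlight) can be illustrated with a minimal sketch. This is not the authors' model: the neural network is replaced here by a hypothetical hand-made weight table (`TOY_WEIGHTS`), and the function names, thresholds, and the choice of taking the maximum token score as the comment score are all assumptions made only to show the shape of such a pipeline.

```python
from dataclasses import dataclass

# Hypothetical stand-in for a trained neural model: a tiny hand-made table of
# per-token "inappropriateness" weights. Nothing here comes from the paper.
TOY_WEIGHTS = {"idiot": 0.9, "stupid": 0.7, "hate": 0.6}


@dataclass
class Moderated:
    flagged: bool      # step 1 result: should a moderator see this comment?
    score: float       # comment-level score in [0, 1]
    highlighted: str   # step 2 result: comment text with marked parts


def token_scores(tokens):
    """Score each token in [0, 1]; a real system would use a learned model."""
    return [TOY_WEIGHTS.get(t.lower().strip(".,!?"), 0.0) for t in tokens]


def moderate(comment, comment_threshold=0.5, token_threshold=0.5):
    """Assumed two-step flow: flag the comment, then mark its risky tokens."""
    tokens = comment.split()
    scores = token_scores(tokens)

    # Step 1: detect. The comment score is simply the highest token score
    # (an assumption chosen for brevity).
    score = max(scores, default=0.0)
    flagged = score >= comment_threshold

    # Step 2: highlight. Only flagged comments get their parts bracketed.
    if flagged:
        marked = [f"[{t}]" if s >= token_threshold else t
                  for t, s in zip(tokens, scores)]
    else:
        marked = tokens
    return Moderated(flagged, score, " ".join(marked))
```

Calling `moderate("You are an idiot!")` flags the comment and brackets the last token, while `moderate("Nice article.")` passes through unflagged and unchanged. Computing both steps from the same per-token scores keeps the highlights consistent with the reason the comment was flagged, which is one plausible reading of "interpretable" in this setting.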

Taxonomy

Topics: Hate Speech and Cyberbullying Detection