# Conversational Networks for Automatic Online Moderation

**Authors:** Etienne Papegnies (LIA), Vincent Labatut (LIA), Richard Dufour (LIA), Georges Linares (LIA)

arXiv: 1901.11281 · 2019-02-01

## TL;DR

This paper introduces an automatic abuse detection system for online chat logs that analyzes conversational network topology rather than message content, reaching an F-measure of 83.89 on chat logs from a French MMO game.

## Contribution

It proposes a content-agnostic method that uses topological features of conversational networks for abuse detection, improving on existing approaches.

## Key findings

- Achieved an F-measure of 83.89 with full features.
- Retained an F-measure of 82.65 with a selection of the most discriminative features, while dramatically cutting computation time.
- Demonstrated effectiveness on French MMO game chat logs.
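
For reference, the scores above can be read against the usual definition of the F-measure. This summary does not state which variant the paper uses, so the balanced form (F1) is assumed here, with $P$ denoting precision and $R$ recall; the reported values appear to be on a 0 to 100 scale.

$$
F = \frac{2 \, P \, R}{P + R}
$$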

## Abstract

Moderation of user-generated content in an online community is a challenge that has great socio-economical ramifications. However, the costs incurred by delegating this work to human agents are high. For this reason, an automatic system able to detect abuse in user-generated content is of great interest. There are a number of ways to tackle this problem, but the most commonly seen in practice are word filtering or regular expression matching. The main limitations are their vulnerability to intentional obfuscation on the part of the users, and their context-insensitive nature. Moreover, they are language-dependent and may require appropriate corpora for training. In this paper, we propose a system for automatic abuse detection that completely disregards message content. We first extract a conversational network from raw chat logs and characterize it through topological measures. We then use these as features to train a classifier on our abuse detection task. We thoroughly assess our system on a dataset of user comments originating from a French Massively Multiplayer Online Game. We identify the most appropriate network extraction parameters and discuss the discriminative power of our features, relatively to their topological and temporal nature. Our method reaches an F-measure of 83.89 when using the full feature set, improving on existing approaches. With a selection of the most discriminative features, we dramatically cut computing time while retaining most of the performance (82.65).
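
The pipeline described in the abstract (extract a conversational network, then characterize it through topological measures) can be illustrated with a minimal sketch. The paper's actual extraction procedure and feature set are not given in this summary, so everything below is an assumption made for illustration: the sliding-window rule, the `1 / distance` edge weighting, the `window` parameter, and the four measures chosen. The function names `build_conversational_graph` and `topological_features` are hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def build_conversational_graph(authors, window=3):
    """Build an undirected weighted graph from a chronological list of authors.

    Assumption (not from the paper): each message links its author to the
    authors of the previous `window` messages, with weight 1 / distance.
    """
    graph = defaultdict(lambda: defaultdict(float))
    for i, sender in enumerate(authors):
        graph[sender]  # make sure isolated authors still appear as nodes
        for distance in range(1, window + 1):
            j = i - distance
            if j < 0:
                break
            receiver = authors[j]
            if receiver == sender:
                continue  # no self-loops
            weight = 1.0 / distance
            graph[sender][receiver] += weight
            graph[receiver][sender] += weight
    return graph


def topological_features(graph, node):
    """Compute a few illustrative topological measures for one author."""
    neighbors = sorted(graph[node])
    degree = len(neighbors)
    strength = sum(graph[node].values())  # sum of incident edge weights
    # Local clustering coefficient: fraction of neighbor pairs that are linked.
    linked = sum(1 for u, v in combinations(neighbors, 2) if v in graph[u])
    possible = degree * (degree - 1) / 2
    clustering = linked / possible if possible else 0.0
    # Graph density: existing edges over all possible edges.
    n = len(graph)
    edges = sum(len(nbrs) for nbrs in graph.values()) / 2
    density = edges / (n * (n - 1) / 2) if n > 1 else 0.0
    return {
        "degree": degree,
        "strength": strength,
        "clustering": clustering,
        "density": density,
    }


# Hypothetical chat excerpt: only the author of each message is used,
# message content is ignored entirely.
chat_authors = ["alice", "bob", "carol", "alice", "bob"]
graph = build_conversational_graph(chat_authors, window=2)
features = topological_features(graph, "alice")
print(features)
```

In a full system, a feature dictionary like this one would be computed for the author of each message under review and passed to a classifier; that training step is left out of the sketch.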

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/1901.11281/full.md

## Figures

14 figures with captions in the complete paper: https://tomesphere.com/paper/1901.11281/full.md

## References

57 references — full list in the complete paper: https://tomesphere.com/paper/1901.11281/full.md

---
Source: https://tomesphere.com/paper/1901.11281