Challenges for Toxic Comment Classification: An In-Depth Error Analysis

Betty van Aken; Julian Risch; Ralf Krestel; Alexander Löser

arXiv:1809.07572·cs.CL·September 21, 2018

TL;DR

This paper compares deep learning and shallow models for toxic comment classification on a new, large comment dataset and validates the findings on a second dataset. It proposes an ensemble that outperforms every individual model and uses an error analysis of the ensemble's results to identify open challenges such as missing context and inconsistent dataset labels.

Contribution

The paper compares deep learning and shallow models, proposes an ensemble that outperforms each of them, and provides an in-depth error analysis that points to directions for future research.

Findings

01 Ensemble outperforms individual models

02 Identifies missing context as a key challenge

03 Highlights issues with dataset label inconsistencies

Abstract

Toxic comment classification has become an active research field with many recently proposed approaches. However, while these approaches address some of the task's challenges, others remain unsolved, and directions for further research are needed. To this end, we compare different deep learning and shallow approaches on a new, large comment dataset and propose an ensemble that outperforms all individual models. Further, we validate our findings on a second dataset. The results of the ensemble enable us to perform an extensive error analysis, which reveals open challenges for state-of-the-art methods and directions towards pending future research. These challenges include missing paradigmatic context and inconsistent dataset labels.
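To illustrate why an ensemble can outperform each of its member models, here is a minimal sketch that averages per-comment toxicity probabilities across models. This is an assumption for illustration only: the abstract does not specify how the ensemble is built, so plain averaging stands in for whatever combination method the authors use. The model names, scores, labels, and the helper functions `average_ensemble` and `accuracy` are all hypothetical.

```python
from typing import Dict, List


def average_ensemble(model_probs: Dict[str, List[float]]) -> List[float]:
    """Average the toxicity probabilities of several models, comment by comment."""
    predictions = list(model_probs.values())
    if not predictions:
        raise ValueError("need at least one model")
    n_comments = len(predictions[0])
    if any(len(p) != n_comments for p in predictions):
        raise ValueError("all models must score the same comments")
    return [
        sum(p[i] for p in predictions) / len(predictions)
        for i in range(n_comments)
    ]


def accuracy(probs: List[float], labels: List[int], threshold: float = 0.5) -> float:
    """Fraction of comments whose thresholded prediction matches the label."""
    hits = sum((p >= threshold) == bool(y) for p, y in zip(probs, labels))
    return hits / len(labels)


# Hypothetical data: 1 = toxic, 0 = not toxic. Each model errs on different comments.
labels = [1, 0, 1, 0]
model_probs = {
    "model_a": [0.9, 0.6, 0.4, 0.1],  # wrong on comments 2 and 3
    "model_b": [0.4, 0.2, 0.9, 0.3],  # wrong on comment 1
    "model_c": [0.8, 0.1, 0.7, 0.7],  # wrong on comment 4
}

ensemble_probs = average_ensemble(model_probs)

for name, probs in model_probs.items():
    print(name, accuracy(probs, labels))
print("ensemble", accuracy(ensemble_probs, labels))
```

In this toy setup the averaged prediction is correct on all four comments because the models make their mistakes on different comments. Averaging helps only to the extent that the member models' errors are not strongly correlated.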
