# One-step and Two-step Classification for Abusive Language Detection on Twitter

**Authors:** Ji Ho Park, Pascale Fung

arXiv: 1706.01206 · 2017-06-06

## TL;DR

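This paper compares one-step classification (a single multi-class classifier) with two-step classification (abusive-or-not detection followed by type classification) for detecting sexist and racist language on Twitter. On a public English corpus of 20 thousand tweets, the two approaches score nearly the same: 0.827 F-measure with HybridCNN in one step and 0.824 F-measure with logistic regression in two steps.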

## Contribution

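It explores a two-step approach, which first classifies whether a tweet is abusive and then classifies abusive tweets into specific types (sexism or racism), and compares it with a one-step approach that performs a single multi-class classification.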
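The difference between the two approaches can be sketched as a matter of how classifiers are composed. The snippet below is a minimal illustration, not the authors' implementation: the label names (`none`, `sexism`, `racism`), the function names, and the toy keyword classifiers keyed on placeholder tokens are all assumptions made for the example. In practice each callable would wrap a trained model such as logistic regression or HybridCNN.

```python
from typing import Callable

# Label names are assumptions for illustration; the source only names
# the two abuse types (sexism and racism).
NONE, SEXISM, RACISM = "none", "sexism", "racism"


def one_step(multiclass: Callable[[str], str], tweet: str) -> str:
    """One-step: a single multi-class decision over all labels."""
    return multiclass(tweet)


def two_step(is_abusive: Callable[[str], bool],
             abuse_type: Callable[[str], str],
             tweet: str) -> str:
    """Two-step: detect abuse first, then type only the abusive tweets."""
    if not is_abusive(tweet):
        return NONE
    return abuse_type(tweet)


# Hypothetical stand-in classifiers that key on placeholder tokens.
def toy_is_abusive(tweet: str) -> bool:
    return "<SEXIST>" in tweet or "<RACIST>" in tweet


def toy_abuse_type(tweet: str) -> str:
    return SEXISM if "<SEXIST>" in tweet else RACISM


def toy_multiclass(tweet: str) -> str:
    if "<SEXIST>" in tweet:
        return SEXISM
    if "<RACIST>" in tweet:
        return RACISM
    return NONE


if __name__ == "__main__":
    for tweet in ["nice weather today", "a <SEXIST> remark", "a <RACIST> remark"]:
        print(tweet, one_step(toy_multiclass, tweet),
              two_step(toy_is_abusive, toy_abuse_type, tweet))
```

In the two-step form, the second classifier never sees tweets that the first step marks as non-abusive, so each step can be trained and tuned separately.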

## Key findings

- HybridCNN achieves 0.827 F-measure in one-step classification.
- Logistic regression achieves 0.824 F-measure in two-step classification.
- The two approaches perform comparably on a public English Twitter corpus of 20 thousand tweets covering sexism and racism.

## Abstract

Automatic abusive language detection is a difficult but important task for online social media. Our research explores a two-step approach of first classifying whether language is abusive and then classifying it into specific types, and compares it with a one-step approach of performing a single multi-class classification to detect sexist and racist language. On a public English Twitter corpus of 20 thousand tweets covering sexism and racism, our approach shows promising performance: 0.827 F-measure using HybridCNN in one step and 0.824 F-measure using logistic regression in two steps.

---
Source: https://tomesphere.com/paper/1706.01206