# Predicting the Type and Target of Offensive Posts in Social Media

**Authors:** Marcos Zampieri, Shervin Malmasi, Preslav Nakov, Sara Rosenthal, Noura Farra, Ritesh Kumar

arXiv: 1902.09666 · 2019-04-17

## TL;DR

This paper introduces a hierarchical approach to identifying and categorizing offensive social media posts, presents a new annotated dataset called OLID, and compares several machine learning models on it.

## Contribution

The paper presents the OLID dataset, annotated with a fine-grained three-layer scheme, and models offensive content detection as a hierarchical classification problem covering both the type and the target of offensive messages.

## Key findings

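- The OLID dataset of annotated tweets is publicly available for research.
- Offensive content is modeled hierarchically, identifying the type and the target of each offensive message rather than a single narrow category such as hate speech.
- OLID is compared with pre-existing datasets for hate speech identification, aggression detection, and similar tasks.
- Several machine learning models are evaluated and compared on OLID.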

## Abstract

As offensive content has become pervasive in social media, there has been much research in identifying potentially offensive messages. However, previous work on this topic did not consider the problem as a whole, but rather focused on detecting very specific types of offensive content, e.g., hate speech, cyberbullying, or cyber-aggression. In contrast, here we target several different kinds of offensive content. In particular, we model the task hierarchically, identifying the type and the target of offensive messages in social media. For this purpose, we compiled the Offensive Language Identification Dataset (OLID), a new dataset with tweets annotated for offensive content using a fine-grained three-layer annotation scheme, which we make publicly available. We discuss the main similarities and differences between OLID and pre-existing datasets for hate speech identification, aggression detection, and similar tasks. We further experiment with and compare the performance of different machine learning models on OLID.
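The hierarchical setup described in the abstract can be illustrated with a minimal Python sketch of a three-level cascade, where each level is consulted only if the previous one fires. The label strings (`OFF`/`NOT`, `TIN`/`UNT`, `IND`/`GRP`/`OTH`) are assumptions taken from how OLID is commonly described and do not appear in this summary; the function name `classify_hierarchically` and the keyword-based toy classifiers are hypothetical stand-ins, not the models evaluated in the paper.

```python
from typing import Callable, Optional, Tuple

# A classifier maps a tweet's text to a label string.
Classifier = Callable[[str], str]


def classify_hierarchically(
    text: str,
    is_offensive: Classifier,   # level 1: assumed labels "OFF" / "NOT"
    is_targeted: Classifier,    # level 2: assumed labels "TIN" / "UNT"
    target_kind: Classifier,    # level 3: assumed labels "IND" / "GRP" / "OTH"
) -> Tuple[str, Optional[str], Optional[str]]:
    """Run three classifiers as a cascade; lower levels are skipped
    (returned as None) when a higher level does not apply."""
    level_a = is_offensive(text)
    if level_a != "OFF":
        return (level_a, None, None)
    level_b = is_targeted(text)
    if level_b != "TIN":
        return (level_a, level_b, None)
    return (level_a, level_b, target_kind(text))


# Toy keyword rules, purely illustrative (hypothetical, not from the paper).
def toy_offensive(text: str) -> str:
    return "OFF" if "idiot" in text.lower() else "NOT"


def toy_targeted(text: str) -> str:
    return "TIN" if "you" in text.lower() else "UNT"


def toy_target(text: str) -> str:
    return "IND" if "you" in text.lower() else "OTH"


if __name__ == "__main__":
    for tweet in ["Nice weather today", "You are an idiot", "What an idiot move"]:
        print(tweet, classify_hierarchically(tweet, toy_offensive, toy_targeted, toy_target))
```

In practice each toy function would be replaced by a trained model; the cascade only fixes the order in which the levels are consulted.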

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/1902.09666/full.md

## References

18 references — full list in the complete paper: https://tomesphere.com/paper/1902.09666/full.md

---
Source: https://tomesphere.com/paper/1902.09666