# Detecting Cyberbullying and Cyberaggression in Social Media

**Authors:** Despoina Chatzakou, Ilias Leontiadis, Jeremy Blackburn, Emiliano De Cristofaro, Gianluca Stringhini, Athena Vakali, and Nicolas Kourtellis

arXiv: 1907.08873 · 2019-07-23

## TL;DR

This paper analyzes 1.2 million Twitter users and 2.1 million tweets to characterize cyberbullying and cyberaggression, and applies machine learning classifiers that identify abusive accounts with over 90% accuracy and AUC. It also examines mechanisms Twitter could use to suspend such accounts.

## Contribution

It introduces a methodology that combines text, user, and network-based attributes to distinguish bullies and aggressors from normal Twitter users, classifying these accounts with over 90% accuracy and AUC.
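To make the idea of combining three attribute groups concrete, here is a minimal sketch of building one feature vector per account. The individual feature names (`hashtags_per_tweet`, `account_age_days`, `clustering_coefficient`, and so on) and all values are hypothetical illustrations, not the paper's actual feature set, which this summary does not list.

```python
from dataclasses import dataclass, astuple


# All field names below are hypothetical placeholders for illustration only.
@dataclass
class TextFeatures:
    hashtags_per_tweet: float
    uppercase_ratio: float


@dataclass
class UserFeatures:
    account_age_days: float
    tweets_posted: float


@dataclass
class NetworkFeatures:
    followers: float
    clustering_coefficient: float


def combine(text: TextFeatures, user: UserFeatures, network: NetworkFeatures) -> list:
    """Concatenate the three attribute groups into one flat feature vector."""
    return [*astuple(text), *astuple(user), *astuple(network)]


# Toy account with made-up values.
vector = combine(
    TextFeatures(hashtags_per_tweet=0.8, uppercase_ratio=0.12),
    UserFeatures(account_age_days=950.0, tweets_posted=4200.0),
    NetworkFeatures(followers=310.0, clustering_coefficient=0.05),
)
print(vector)
```

A flat vector like this is the usual input to a standard classifier. Keeping the groups as separate types makes it easy to drop one group and measure how much it contributes.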

## Key findings

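- Bullies and aggressors are distinguished from normal Twitter users with over 90% accuracy and AUC
- Users discussing seemingly normal topics (e.g., the NBA) are compared with users in discussions more likely to be hate-related (Gamergate, gender pay inequality at the BBC)
- The current status of accounts marked as abusive is examined, along with the performance of mechanisms Twitter could use to suspend users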

## Abstract

Cyberbullying and cyberaggression are increasingly worrisome phenomena affecting people across all demographics. More than half of young social media users worldwide have been exposed to such prolonged and/or coordinated digital harassment. Victims can experience a wide range of emotions, with negative consequences such as embarrassment, depression, isolation from other community members, which embed the risk to lead to even more critical consequences, such as suicide attempts.

In this work, we take the first concrete steps to understand the characteristics of abusive behavior in Twitter, one of today's largest social media platforms. We analyze 1.2 million users and 2.1 million tweets, comparing users participating in discussions around seemingly normal topics like the NBA, to those more likely to be hate-related, such as the Gamergate controversy, or the gender pay inequality at the BBC station. We also explore specific manifestations of abusive behavior, i.e., cyberbullying and cyberaggression, in one of the hate-related communities (Gamergate). We present a robust methodology to distinguish bullies and aggressors from normal Twitter users by considering text, user, and network-based attributes. Using various state-of-the-art machine learning algorithms, we classify these accounts with over 90% accuracy and AUC. Finally, we discuss the current status of Twitter user accounts marked as abusive by our methodology, and study the performance of potential mechanisms that can be used by Twitter to suspend users in the future.
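The abstract reports both accuracy and AUC. As a reminder of what these two metrics measure, the sketch below computes them on a tiny made-up example. It assumes a binary setup (1 = abusive, 0 = normal) and a 0.5 decision threshold. The labels and scores are invented for illustration and are not data or results from the paper, whose evaluation setup may differ.

```python
def accuracy(labels, predictions):
    """Fraction of accounts whose predicted class matches the true class."""
    correct = sum(1 for y, p in zip(labels, predictions) if y == p)
    return correct / len(labels)


def roc_auc(labels, scores):
    """Area under the ROC curve via its pairwise-ranking interpretation:
    the probability that a randomly chosen positive is scored above a
    randomly chosen negative (ties count as half)."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Invented toy data: true classes and classifier scores for six accounts.
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.6, 0.3, 0.1]

# Hard predictions at an assumed 0.5 threshold.
predictions = [1 if s >= 0.5 else 0 for s in scores]

print(f"accuracy = {accuracy(labels, predictions):.3f}")
print(f"AUC      = {roc_auc(labels, scores):.3f}")
```

Accuracy depends on the chosen threshold, while AUC summarizes ranking quality across all thresholds. Reporting both gives a fuller picture when classes are imbalanced.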

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/1907.08873/full.md

## Figures

45 figures with captions in the complete paper: https://tomesphere.com/paper/1907.08873/full.md

## References

102 references — full list in the complete paper: https://tomesphere.com/paper/1907.08873/full.md

---
Source: https://tomesphere.com/paper/1907.08873