# Trollslayer: Crowdsourcing and Characterization of Abusive Birds in Twitter

**Authors:** Alvaro Garcia-Recuero, Aneta Morawin, Gareth Tyson

arXiv: 1812.06156 · 2018-12-18

## TL;DR

This paper introduces Trollslayer, a crawler-collected Twitter dataset and feature set for characterizing abusive users. It combines account attributes with social-graph features computed between the sender and receiver of each message, and uses the Jaccard index as a similarity metric to help separate benign from malicious directed messages.

## Contribution

The paper presents a custom Twitter crawler, a comprehensive feature set built from user attributes and social-graph metadata, and what the authors describe as the first use of the Jaccard index as a similarity metric for characterizing abuse in Twitter.

## Key findings

- The Jaccard index, computed on the social graph between sender and receiver, is a key feature for revealing whether a directed message is benign or malicious.
- The dataset enables detailed analysis of abusive user behaviors.
- Graph-based features reveal information dissemination patterns.

## Abstract

As of today, abuse is a pressing issue for participants and administrators of Online Social Networks (OSN). Abuse in Twitter can stem from arguments generated to influence the outcome of a political election, the use of bots to automatically spread misinformation, and, generally speaking, activities that deny, disrupt, degrade or deceive other participants and/or the network. Given the difficulty of finding and accessing a large enough sample of abuse ground truth from the Twitter platform, we built and deployed a custom crawler that we use to judiciously collect a new dataset from the Twitter platform with the aim of characterizing the nature of abusive users, a.k.a. abusive birds, in the wild. We provide a comprehensive set of features based on users' attributes, as well as social-graph metadata. The former includes metadata about the account itself, while the latter is computed from the social graph between the sender and the receiver of each message. Attribute-based features are useful to characterize users' accounts in OSN, while graph-based features can reveal the dynamics of information dissemination across the network. In particular, we derive the Jaccard index as a key feature to reveal the benign or malicious nature of directed messages in Twitter. To the best of our knowledge, we are the first to propose such a similarity metric to characterize abuse in Twitter.
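
For readers unfamiliar with the metric, the Jaccard index of two sets A and B is |A ∩ B| / |A ∪ B|. The following is a minimal sketch of how it could be computed for a sender and receiver pair, not the authors' implementation. It assumes each user is represented by a set of neighbor account IDs; which neighborhood the paper actually uses (followers, followees, or both) is not stated in this summary, and the names `jaccard_index`, `sender_neighbors`, and `receiver_neighbors` are hypothetical.

```python
def jaccard_index(a: set, b: set) -> float:
    """Return |A ∩ B| / |A ∪ B|; defined here as 0.0 when both sets are empty."""
    union = a | b
    if not union:
        # Assumption: two users with no neighbors share no overlap.
        return 0.0
    return len(a & b) / len(union)


# Hypothetical neighbor sets for the sender and receiver of one directed message.
sender_neighbors = {"u1", "u2", "u3", "u4"}
receiver_neighbors = {"u3", "u4", "u5"}

# Two shared neighbors out of five distinct accounts overall.
score = jaccard_index(sender_neighbors, receiver_neighbors)
print(score)  # 0.4
```

The score ranges from 0 (no shared neighbors) to 1 (identical neighborhoods). How the score maps to benign or abusive messages is an empirical question addressed in the full paper.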

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/1812.06156/full.md

## Figures

22 figures with captions in the complete paper: https://tomesphere.com/paper/1812.06156/full.md

## References

27 references — full list in the complete paper: https://tomesphere.com/paper/1812.06156/full.md

---
Source: https://tomesphere.com/paper/1812.06156