Deep Neural Networks for Bot Detection

Sneha Kudugunta; Emilio Ferrara

arXiv:1802.04289·cs.AI·September 27, 2018

TL;DR

This paper introduces a deep neural network built on an LSTM architecture that detects Twitter bots at the tweet level (over 96% AUC) by combining tweet content with user metadata, and shows that the model can be trained effectively from a minimal amount of labeled data.

Contribution

It presents a novel LSTM-based model that exploits both tweet content and metadata for bot detection, and introduces a synthetic oversampling technique for training with limited data.
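
As a rough illustration of what synthetic oversampling can look like, the sketch below creates new minority-class samples by interpolating between an existing sample and one of its nearest neighbours. This is an assumption about the general family of technique (a SMOTE-style interpolation), not the paper's exact procedure; the function name `smote_like_oversample`, the parameter `k`, and the toy feature values are all hypothetical.

```python
import random


def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5


def smote_like_oversample(minority, n_new, k=2, seed=0):
    """Return n_new synthetic samples (hypothetical helper, SMOTE-style).

    Each synthetic sample lies on the line segment between a randomly
    chosen minority sample and one of its k nearest minority neighbours.
    """
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # Rank the other minority samples by distance to the base sample.
        others = [p for p in minority if p is not base]
        neighbours = sorted(others, key=lambda p: euclidean(base, p))[:k]
        neighbour = rng.choice(neighbours)
        t = rng.random()  # interpolation weight in [0, 1)
        synthetic.append([b + t * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


# Toy minority class: made-up two-dimensional feature vectors.
bots = [[0.9, 120.0], [0.8, 150.0], [0.95, 110.0], [0.7, 200.0]]
new_points = smote_like_oversample(bots, n_new=6)
```

Because every synthetic point is a convex combination of two real samples, the generated data stays inside the region already covered by the labeled minority class.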

Findings

01 Achieves over 96% AUC for tweet-level bot detection.

02 Achieves near-perfect performance (AUC > 99%) at account-level detection.

03 Outperforms previous methods while using a minimal feature set and minimal training data.

Abstract

Detecting bots, automated social media accounts governed by software but disguised as human users, has far-reaching implications. For example, bots have been used to sway political elections by distorting online discourse, to manipulate the stock market, and to push anti-vaccine conspiracy theories that caused health epidemics. Most techniques proposed to date detect bots at the account level by processing large amounts of social media posts and leveraging information from network structure, temporal dynamics, sentiment analysis, etc. In this paper, we propose a deep neural network based on a contextual long short-term memory (LSTM) architecture that exploits both content and metadata to detect bots at the tweet level: contextual features are extracted from user metadata and fed as auxiliary input to LSTM deep nets processing the tweet text. Another contribution that we make…
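
The contextual LSTM idea in the abstract can be sketched in plain Python as follows. This is a minimal, untrained illustration and not the authors' implementation: it assumes the metadata vector is concatenated with the final LSTM hidden state before a sigmoid output unit (the abstract does not say where the auxiliary input is merged), and the class name `TinyContextualLSTM`, the dimensions, and the toy inputs are hypothetical.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def matvec(W, v):
    """Multiply matrix W (list of rows) by vector v."""
    return [sum(w * x for w, x in zip(row, v)) for row in W]


def rand_matrix(rng, rows, cols, scale=0.1):
    return [[rng.uniform(-scale, scale) for _ in range(cols)] for _ in range(rows)]


class TinyContextualLSTM:
    """Hypothetical sketch: LSTM over token vectors plus auxiliary metadata."""

    def __init__(self, embed_dim, hidden_dim, meta_dim, seed=0):
        rng = random.Random(seed)
        in_dim = embed_dim + hidden_dim
        # One weight matrix and bias per gate:
        # i = input, f = forget, o = output, c = candidate cell state.
        self.W = {g: rand_matrix(rng, hidden_dim, in_dim) for g in "ifoc"}
        self.b = {g: [0.0] * hidden_dim for g in "ifoc"}
        # Output unit sees [final hidden state ; metadata] (an assumption).
        self.w_out = [rng.uniform(-0.1, 0.1) for _ in range(hidden_dim + meta_dim)]
        self.b_out = 0.0
        self.hidden_dim = hidden_dim

    def encode_text(self, token_vectors):
        """Run the LSTM over the token vectors; return the last hidden state."""
        h = [0.0] * self.hidden_dim
        c = [0.0] * self.hidden_dim
        for x in token_vectors:
            z = list(x) + h  # concatenate current input and previous hidden state
            pre = {
                g: [a + b for a, b in zip(matvec(self.W[g], z), self.b[g])]
                for g in "ifoc"
            }
            i_gate = [sigmoid(v) for v in pre["i"]]
            f_gate = [sigmoid(v) for v in pre["f"]]
            o_gate = [sigmoid(v) for v in pre["o"]]
            cand = [math.tanh(v) for v in pre["c"]]
            c = [f * cj + i * g for f, cj, i, g in zip(f_gate, c, i_gate, cand)]
            h = [o * math.tanh(cj) for o, cj in zip(o_gate, c)]
        return h

    def predict(self, token_vectors, metadata):
        """Return a bot probability in (0, 1) for one tweet."""
        features = self.encode_text(token_vectors) + list(metadata)
        logit = sum(w * x for w, x in zip(self.w_out, features)) + self.b_out
        return sigmoid(logit)


# Toy usage: two made-up token embeddings and two made-up metadata features.
model = TinyContextualLSTM(embed_dim=3, hidden_dim=4, meta_dim=2)
tokens = [[0.1, 0.2, 0.3], [0.0, -0.1, 0.4]]
p_bot = model.predict(tokens, metadata=[1.0, 0.0])
```

The weights here are random, so `p_bot` is meaningless as a prediction; the point is only to show how the sigmoid gates, the tanh activations, and the auxiliary metadata input fit together in one forward pass.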

Taxonomy

Methods: Sigmoid Activation · Tanh Activation · Long Short-Term Memory