Sifting Robotic from Organic Text: A Natural Language Approach for Detecting Automation on Twitter

Eric M. Clark; Jake Ryland Williams; Chris A. Jones; Richard A. Galbraith; Christopher M. Danforth; Peter Sheridan Dodds

arXiv:1505.04342·cs.CL·June 15, 2016

TL;DR

This paper introduces a natural language processing method that detects automated Twitter accounts based solely on the text they post, offering a flexible tool that can be applied to other textual datasets.

Contribution

The study presents a novel text-only classification approach for identifying Twitter bots, moving beyond metadata-based detection methods.

Findings

01 Effective in distinguishing bots from organic users using text analysis

02 Applicable to other textual data beyond Twitter

03 Operates independently of account metadata

Abstract

Twitter, a popular social media outlet, has evolved into a vast source of linguistic data, rich with opinion, sentiment, and discussion. Due to the increasing popularity of Twitter, its perceived potential for exerting social influence has led to the rise of a diverse community of automatons, commonly referred to as bots. These inorganic and semi-organic Twitter entities can range from the benevolent (e.g., weather-update bots, help-wanted-alert bots) to the malevolent (e.g., spamming messages, advertisements, or radical opinions). Existing detection algorithms typically leverage meta-data (time between tweets, number of followers, etc.) to identify robotic accounts. Here, we present a powerful classification scheme that exclusively uses the natural language text from organic users to provide a criterion for identifying accounts posting automated messages. Since the classifier operates…
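The idea of deriving a detection criterion from organic users' text alone can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the feature (mean pairwise Jaccard dissimilarity between an account's tweets), the cutoff rule (organic mean minus `k` standard deviations), the function names, and the toy tweets. The only point it shares with the abstract is that the threshold is calibrated on organic accounts and no metadata is used.

```python
from itertools import combinations
from statistics import mean, stdev


def jaccard_dissimilarity(a: str, b: str) -> float:
    """1 - |shared words| / |all words| for two tweets (hypothetical feature)."""
    words_a, words_b = set(a.lower().split()), set(b.lower().split())
    union = words_a | words_b
    if not union:
        return 0.0  # two empty tweets are treated as identical
    return 1.0 - len(words_a & words_b) / len(union)


def mean_pairwise_dissimilarity(tweets: list[str]) -> float:
    """Average dissimilarity over every pair of an account's tweets."""
    pairs = list(combinations(tweets, 2))
    if not pairs:
        raise ValueError("need at least two tweets per account")
    return mean(jaccard_dissimilarity(a, b) for a, b in pairs)


def organic_threshold(organic_accounts: list[list[str]], k: float = 2.0) -> float:
    """Cutoff learned from organic accounts only: mean score minus k std devs."""
    scores = [mean_pairwise_dissimilarity(t) for t in organic_accounts]
    return mean(scores) - k * stdev(scores)


def looks_automated(tweets: list[str], threshold: float) -> bool:
    """Flag an account whose tweets are more repetitive than the organic cutoff."""
    return mean_pairwise_dissimilarity(tweets) < threshold


# Toy data, invented for this example.
organic = [
    ["coffee first then emails", "rain again in burlington today",
     "who else is watching the game tonight"],
    ["lunch was great", "great lunch with friends", "traffic is terrible"],
    ["new bike arrived", "cannot believe that ending", "happy birthday mom"],
]
bot = ["weather update sunny 72F"] * 3

threshold = organic_threshold(organic)
print(looks_automated(bot, threshold))         # repetitive account
print(looks_automated(organic[0], threshold))  # varied account
```

Because the cutoff depends only on text from accounts assumed to be human, the same pattern could in principle be recalibrated on any other corpus of short messages; a real classifier would need far more data and more than one feature.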
