Towards Automatic Bot Detection in Twitter for Health-related Tasks

Anahita Davoudi; Ari Z. Klein; Abeed Sarker; Graciela Gonzalez-Hernandez

arXiv:1909.13184 · cs.CL · October 1, 2019 · 5 cites

Open Access

TL;DR

This paper adapts an existing Twitter bot detection system, originally designed for political bot detection, to health-related social media data, raising the F1 score for the "bot" class to 0.7 (an improvement of 0.339).

Contribution

It extends a political bot detection system with additional features and a statistical machine learning classifier, customized for health-related Twitter users, to improve detection performance.

Findings

01

F1 score of 0.7 for the "bot" class among health-related Twitter users

02

Improvement of 0.339 in F1 score over the existing system, which was designed for political bot detection

03

The approach is customizable and generalizable to other health-related social media cohorts

Abstract

With the increasing use of social media data for health-related research, the credibility of the information from this source has been questioned as the posts may originate from automated accounts or "bots". While automatic bot detection approaches have been proposed, there are none that have been evaluated on users posting health-related information. In this paper, we extend an existing bot detection system and customize it for health-related research. Using a dataset of Twitter users, we first show that the system, which was designed for political bot detection, underperforms when applied to health-related Twitter users. We then incorporate additional features and a statistical machine learning classifier to significantly improve bot detection performance. Our approach obtains F1 scores of 0.7 for the "bot" class, representing improvements of 0.339. Our approach is customizable and…
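
As a rough illustration of the metric reported in the abstract, the following sketch computes the F1 score for a single class (here "bot") from predicted and true labels. The function name f1_for_class and the toy labels are hypothetical and are not taken from the paper. The final lines assume the reported improvement of 0.339 is an absolute difference in F1, which would imply a baseline of about 0.361 for the original system.

```python
def f1_for_class(y_true, y_pred, positive="bot"):
    """Return the F1 score for one class, treating it as the positive label."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)

    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0:
        return 0.0
    # F1 is the harmonic mean of precision and recall.
    return 2 * precision * recall / (precision + recall)


# Hypothetical toy labels, for illustration only (not data from the paper).
y_true = ["bot", "bot", "bot", "human", "human", "human", "human", "human"]
y_pred = ["bot", "bot", "human", "bot", "human", "human", "human", "human"]
toy_f1 = f1_for_class(y_true, y_pred)

# Assumption: the 0.339 improvement is an absolute difference in F1.
reported_f1 = 0.7
reported_gain = 0.339
implied_baseline = round(reported_f1 - reported_gain, 3)

print(f"toy F1 for 'bot': {toy_f1:.3f}")
print(f"implied baseline F1: {implied_baseline}")
```

Scoring only the "bot" class, rather than averaging over both classes, matters when bots are a small minority of users: a system that labels everyone "human" can look accurate overall while scoring an F1 of zero on the class of interest.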

Taxonomy

Topics: Spam and Phishing Detection · Misinformation and Its Impacts · Topic Modeling