Towards Automatic Bot Detection in Twitter for Health-related Tasks
Anahita Davoudi, Ari Z. Klein, Abeed Sarker, Graciela Gonzalez-Hernandez

TL;DR
This paper adapts an existing Twitter bot detection system, originally designed for political bot detection, to health-related social media data, raising the F1 score for the "bot" class to 0.7, an improvement of 0.339.
Contribution
It extends a political bot detection system with new features and machine learning techniques tailored for health-related Twitter users, enhancing detection performance.
Findings
F1 score of 0.7 for the "bot" class on health-related Twitter users
Improvement of 0.339 in F1 score over the existing bot detection system that the paper extends
The approach is customizable and generalizable to other health-related social media cohorts
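For readers unfamiliar with the metric, the per-class F1 score reported above is the harmonic mean of precision and recall for the "bot" class. The following is a minimal sketch of that calculation, not the authors' evaluation code; the function name `f1_for_class` and the toy labels are hypothetical and used only for illustration.

```python
def f1_for_class(y_true, y_pred, positive="bot"):
    """Return the F1 score for one class (hypothetical helper, illustration only)."""
    pairs = list(zip(y_true, y_pred))
    # True positives: actual bots predicted as bots.
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    # False positives: non-bots predicted as bots.
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    # False negatives: actual bots predicted as something else.
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        # Precision and recall are both zero (or undefined), so report 0.
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Toy labels invented for this example; they are not data from the paper.
y_true = ["bot", "bot", "bot", "human", "human", "human", "human", "human"]
y_pred = ["bot", "bot", "human", "bot", "human", "human", "human", "human"]

bot_f1 = f1_for_class(y_true, y_pred)
print(round(bot_f1, 3))
```

Because bots are typically a small minority of accounts, overall accuracy can look high even when few bots are caught, which is one reason to report F1 for the "bot" class specifically. If the 0.339 improvement is read as an absolute difference (an assumption, since the summary does not say so explicitly), the implied starting F1 would be about 0.361.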
Abstract
With the increasing use of social media data for health-related research, the credibility of the information from this source has been questioned as the posts may originate from automated accounts or "bots". While automatic bot detection approaches have been proposed, there are none that have been evaluated on users posting health-related information. In this paper, we extend an existing bot detection system and customize it for health-related research. Using a dataset of Twitter users, we first show that the system, which was designed for political bot detection, underperforms when applied to health-related Twitter users. We then incorporate additional features and a statistical machine learning classifier to significantly improve bot detection performance. Our approach obtains F_1 scores of 0.7 for the "bot" class, representing improvements of 0.339. Our approach is customizable and…
Taxonomy
Topics: Spam and Phishing Detection · Misinformation and Its Impacts · Topic Modeling
