LOBO -- Evaluation of Generalization Deficiencies in Twitter Bot Classifiers
Juan Echeverría, Emiliano De Cristofaro, Nicolas Kourtellis, Ilias Leontiadis, Gianluca Stringhini, and Shi Zhou

TL;DR
This paper evaluates the generalization capabilities of Twitter bot classifiers by testing them on unseen bot classes, revealing that high accuracy on known data does not guarantee effectiveness against new bot types.
Contribution
It introduces a methodology for evaluating bot classifiers on unseen classes and demonstrates that current models often fail to generalize beyond their training data.
Findings
Classifiers trained on specific bot classes do not generalize well to new, unseen bot classes.
A classifier trained on over 200,000 data points achieved 97% accuracy on known data.
The methodology provides a robust way to assess the true effectiveness of bot detection systems.
Abstract
Botnets in online social networks are increasingly often affecting the regular flow of discussion, attacking regular users and their posts, spamming them with irrelevant or offensive content, and even manipulating the popularity of messages and accounts. Researchers and cybercriminals are involved in an arms race, and new and updated botnets designed to defeat current detection systems are constantly developed, rendering such detection systems obsolete. In this paper, we motivate the need for a generalized evaluation in Twitter bot detection and propose a methodology to evaluate bot classifiers by testing them on unseen bot classes. We show that this methodology is empirically robust, using bot classes of varying sizes and characteristics and reaching similar results, and argue that methods trained and tested on single bot classes or datasets might not be able to generalize to new bot…
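The evaluation idea described in the abstract, testing a classifier on a bot class it never saw during training, can be illustrated with a minimal leave-one-class-out sketch. This is not the authors' implementation: the names `leave_one_class_out`, `evaluate`, and `fit_threshold`, the single-feature threshold classifier, and the toy data are all assumptions made for illustration only.

```python
from typing import Callable, Dict, Iterator, List, Sequence, Tuple

# A sample is (features, label); the label is "human" or a bot-class name.
Sample = Tuple[Sequence[float], str]
Predictor = Callable[[Sequence[float]], bool]  # True means "flagged as bot"

HUMAN = "human"  # hypothetical label for genuine accounts


def leave_one_class_out(
    samples: List[Sample],
) -> Iterator[Tuple[str, List[Sample], List[Sample]]]:
    """Yield (held_out_class, train, test) once per bot class.

    The held-out bot class is removed from training entirely, so the
    test set contains only bots of a kind the classifier has never seen.
    """
    bot_classes = sorted({label for _, label in samples if label != HUMAN})
    for held_out in bot_classes:
        train = [s for s in samples if s[1] != held_out]
        test = [s for s in samples if s[1] == held_out]
        yield held_out, train, test


def evaluate(
    samples: List[Sample], fit: Callable[[List[Sample]], Predictor]
) -> Dict[str, float]:
    """Return, per held-out bot class, the fraction of its bots that were flagged."""
    results: Dict[str, float] = {}
    for held_out, train, test in leave_one_class_out(samples):
        predict = fit(train)
        detected = sum(1 for features, _ in test if predict(features))
        results[held_out] = detected / len(test)
    return results


def fit_threshold(train: List[Sample]) -> Predictor:
    """Toy stand-in classifier: threshold on the first feature, placed at the
    midpoint between the mean human value and the mean bot value."""
    humans = [f[0] for f, label in train if label == HUMAN]
    bots = [f[0] for f, label in train if label != HUMAN]
    mean_h = sum(humans) / len(humans)
    mean_b = sum(bots) / len(bots)
    cut = (mean_h + mean_b) / 2
    bot_is_higher = mean_b > mean_h
    return lambda features: (features[0] > cut) == bot_is_higher


# Toy data (invented): classes A and B resemble each other, class C looks human.
toy: List[Sample] = [
    ([1.0], HUMAN), ([2.0], HUMAN),
    ([9.0], "A"), ([10.0], "A"),
    ([8.0], "B"), ([11.0], "B"),
    ([1.5], "C"), ([2.5], "C"),
]

results = evaluate(toy, fit_threshold)
print(results)
```

On this toy data the held-out classes A and B are fully detected because they resemble bot classes left in the training set, while class C is missed entirely. That contrast mirrors the concern raised above: strong results on familiar bot classes say little about how a classifier handles a class it has not seen.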
Taxonomy
Topics: Spam and Phishing Detection · Advanced Malware Detection Techniques · Network Security and Intrusion Detection
