On the efficacy of old features for the detection of new bots

Rocco De Nicola; Marinella Petrocchi; Manuel Pratelli

arXiv:2506.19635 · cs.CR · June 25, 2025

TL;DR

This study evaluates the effectiveness of traditional and inexpensive Twitter account features in detecting new, evolved bots, suggesting that simple classifiers can still be effective against sophisticated malicious accounts.

Contribution

The paper compares the performance of several feature sets, including old and cheap-to-extract ones, for bot detection on recent Twitter datasets, and highlights that these features remain potentially useful.

Findings

01 Cheap features perform well in detecting evolved bots

02 General-purpose classifiers can be effective with simple features

03 Old features remain relevant for bot detection
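To make "cheap features" concrete, here is a minimal sketch of deriving a few inexpensive features from a single account profile, of the kind that could then be fed to a general-purpose classifier. The field names follow the Twitter v1.1 user object; the function name `cheap_profile_features` and the particular choice of features are illustrative assumptions, not the feature sets evaluated in the paper.

```python
from datetime import datetime, timezone


def cheap_profile_features(profile, now):
    """Derive a few inexpensive features from one account profile dict.

    Hypothetical helper: the feature choice is an assumption made for
    illustration, not the paper's feature set.
    """
    # Clamp to at least one day to avoid dividing by zero for new accounts.
    age_days = max((now - profile["created_at"]).days, 1)
    friends = profile["friends_count"]
    followers = profile["followers_count"]
    return {
        "account_age_days": age_days,
        "statuses_per_day": profile["statuses_count"] / age_days,
        # Clamp the denominator so accounts following nobody do not fail.
        "followers_per_friend": followers / max(friends, 1),
        "has_description": int(bool(profile.get("description"))),
        "default_profile_image": int(profile.get("default_profile_image", False)),
    }


# Made-up profile, not taken from any dataset.
example = {
    "created_at": datetime(2020, 1, 1, tzinfo=timezone.utc),
    "statuses_count": 3660,
    "followers_count": 50,
    "friends_count": 200,
    "description": "",
    "default_profile_image": True,
}
features = cheap_profile_features(
    example, now=datetime(2021, 1, 1, tzinfo=timezone.utc)
)
```

Every feature here comes from a single profile lookup, with no timeline or network crawling, which is what makes such features cheap compared with content-based or graph-based ones.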

Abstract

For more than a decade, academics and online platform administrators have been studying solutions to the problem of bot detection. Bots are computer algorithms whose use is far from benign: malicious bots are purposely created to distribute spam, promote public figures and, ultimately, bias public opinion. To fight the bot invasion of our online ecosystem, several approaches have been implemented, mostly based on (supervised and unsupervised) classifiers that adopt a wide variety of account features, from the simplest to the most expensive to extract from the raw data obtainable through the Twitter public APIs. In this exploratory study, using Twitter as a benchmark, we compare the performance of four state-of-the-art feature sets in detecting novel bots: one of the output scores of the popular bot detector Botometer, which considers more than…

Taxonomy

Methods: Hierarchical Information Threading · ADaptive gradient method with the OPTimal convergence rate