Sonic: Fast and Transferable Data Poisoning on Clustering Algorithms
Francesco Villani, Dario Lazzaro, Antonio Emanuele Cinà, Matteo Dell'Amico, Battista Biggio, Fabio Roli

TL;DR
Sonic is a scalable, transferable data poisoning attack on clustering algorithms. It uses incremental clustering methods as surrogates, which significantly improves attack efficiency and effectiveness on large datasets.
Contribution
The paper presents Sonic, a novel genetic poisoning attack that leverages incremental clustering algorithms to enhance scalability and transferability against various clustering methods.
Findings
Sonic effectively poisons clustering algorithms with high efficiency.
Incremental clustering algorithms enable scalable poisoning attacks.
Hyperparameter robustness analysis improves attack reliability.
Abstract
Data poisoning attacks on clustering algorithms have received limited attention, with existing methods struggling to scale efficiently as dataset sizes and feature counts increase. These attacks typically require re-clustering the entire dataset multiple times to generate predictions and assess the attacker's objectives, significantly hindering their scalability. This paper addresses these limitations by proposing Sonic, a novel genetic data poisoning attack that leverages incremental and scalable clustering algorithms, e.g., FISHDBC, as surrogates to accelerate poisoning attacks against graph-based and density-based clustering methods, such as HDBSCAN. We empirically demonstrate the effectiveness and efficiency of Sonic in poisoning the target clustering algorithms. We then conduct a comprehensive analysis of the factors affecting the scalability and transferability of poisoning…
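The idea in the abstract (a genetic search over bounded perturbations, scored against a surrogate that is updated incrementally rather than re-clustered from scratch) can be illustrated with a small sketch. This is not the authors' implementation and does not use FISHDBC or HDBSCAN: the surrogate here is a toy epsilon-neighborhood clustering built on union-find, and every name (`make_fitness`, `genetic_poison`, `eps`, `budget`, the population settings) is a hypothetical choice made for illustration. The untouched points are clustered once, and each candidate only re-inserts the attacker-controlled points.

```python
import random
from itertools import combinations

def find(parent, i):
    # Union-find root lookup with path halving.
    while parent[i] != i:
        parent[i] = parent[parent[i]]
        i = parent[i]
    return i

def union(parent, a, b):
    ra, rb = find(parent, a), find(parent, b)
    if ra != rb:
        parent[rb] = ra

def close(p, q, eps):
    return sum((a - b) ** 2 for a, b in zip(p, q)) <= eps * eps

def build_base(points, controlled, eps):
    # Toy surrogate: link untouched points within eps. Done once per attack.
    parent = list(range(len(points)))
    fixed = [i for i in range(len(points)) if i not in controlled]
    for i, j in combinations(fixed, 2):
        if close(points[i], points[j], eps):
            union(parent, i, j)
    return parent

def insert_points(base, points, controlled, eps):
    # Incremental step: only the controlled points are (re)linked.
    parent = list(base)
    for i in controlled:
        for j in range(len(points)):
            if j != i and close(points[i], points[j], eps):
                union(parent, i, j)
    return [find(parent, i) for i in range(len(points))]

def disagreement(a, b):
    # Fraction of point pairs grouped differently (1 - Rand index).
    pairs = list(combinations(range(len(a)), 2))
    return sum((a[i] == a[j]) != (b[i] == b[j]) for i, j in pairs) / len(pairs)

def make_fitness(points, controlled, eps):
    base = build_base(points, controlled, eps)
    clean = insert_points(base, points, controlled, eps)

    def fitness(delta):
        poisoned = list(points)
        for k, i in enumerate(controlled):
            poisoned[i] = tuple(x + d for x, d in zip(points[i], delta[k]))
        return disagreement(clean, insert_points(base, poisoned, controlled, eps))

    return fitness

def genetic_poison(points, controlled, eps, budget,
                   pop_size=20, generations=30, seed=0):
    # Assumed GA settings; perturbations are clipped to an L-infinity budget.
    rng = random.Random(seed)
    dim = len(points[0])
    fitness = make_fitness(points, controlled, eps)
    clip = lambda v: max(-budget, min(budget, v))
    population = [[tuple(rng.uniform(-budget, budget) for _ in range(dim))
                   for _ in controlled] for _ in range(pop_size)]
    for _ in range(generations):
        population.sort(key=fitness, reverse=True)
        elite = population[: pop_size // 2]          # selection with elitism
        children = []
        while len(elite) + len(children) < pop_size:
            a, b = rng.sample(elite, 2)
            children.append([                         # uniform crossover + Gaussian mutation
                tuple(clip(rng.choice(pair) + rng.gauss(0, 0.1 * budget))
                      for pair in zip(ga, gb))
                for ga, gb in zip(a, b)])
        population = elite + children
    best = max(population, key=fitness)
    return best, fitness(best)
```

In this toy setup the per-candidate cost depends on the number of controlled points rather than on a full re-clustering, which is the scalability argument the abstract makes for incremental surrogates. Whether a perturbation found this way transfers to a different target algorithm is a separate question that the sketch does not address.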
Taxonomy
Topics: Anomaly Detection Techniques and Applications · Data Stream Mining Techniques · Topological and Geometric Data Analysis
