Is feature selection secure against training data poisoning?

Huang Xiao; Battista Biggio; Gavin Brown; Giorgio Fumera; Claudia Eckert; Fabio Roli

arXiv:1804.07933 · cs.LG · April 24, 2018 · 288 cites

Open Access

TL;DR

This paper investigates the robustness of popular feature selection methods, including LASSO, ridge regression, and the elastic net, against training data poisoning attacks. Experiments on malware detection show that these methods can be significantly compromised.

Contribution

The paper provides a framework for evaluating the security of feature selection methods under adversarial poisoning attacks, and uses it to demonstrate their susceptibility and the need for countermeasures.
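
For orientation, the three methods named above are commonly written as one regularized least-squares problem that differs only in the penalty term. The block below is a sketch in standard textbook notation, not the paper's own: the symbols $n$, $x_i$, $y_i$, $w$, $b$, $\lambda$, $\rho$ and the $\frac{1}{2n}$ scaling are assumptions chosen for illustration, and the paper's exact conventions may differ.

```latex
\min_{w,\,b}\; \frac{1}{2n}\sum_{i=1}^{n}\bigl(y_i - w^\top x_i - b\bigr)^2 + \lambda\,\Omega(w),
\qquad
\Omega(w) =
\begin{cases}
\lVert w \rVert_1 & \text{(LASSO)}\\[2pt]
\tfrac{1}{2}\lVert w \rVert_2^2 & \text{(ridge regression)}\\[2pt]
\rho\,\lVert w \rVert_1 + (1-\rho)\,\tfrac{1}{2}\lVert w \rVert_2^2,\; \rho \in [0,1] & \text{(elastic net)}
\end{cases}
```

Under this formulation, features are typically selected by keeping the weights with the largest magnitude $|w_j|$; a poisoning attack adds training points chosen to change which weights those are.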

Findings

01 Feature selection methods can be significantly compromised by poisoning attacks.

02 Careful insertion of less than 5% poisoned data can reduce LASSO to almost random choices of feature sets.

03 The study highlights the importance of developing defenses for feature selection in adversarial settings.
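
To make the idea of a compromised feature set concrete, the following is a minimal pure-Python sketch of how one might compare the features LASSO selects on clean data with those it selects after a few crafted points are injected. Everything here is an assumption for illustration: the synthetic regression data, the coordinate-descent solver, the hypothetical helper `add_naive_poison` (a crude hand-picked outlier, not the optimized attack studied in the paper), and the use of Jaccard overlap as a stand-in for whatever stability measure the paper reports.

```python
import random


def soft_threshold(z, t):
    """Soft-thresholding operator used by the LASSO coordinate update."""
    if z > t:
        return z - t
    if z < -t:
        return z + t
    return 0.0


def lasso_cd(X, y, lam, n_iter=100):
    """Minimise (1/2n)*||y - Xw||^2 + lam*||w||_1 by cyclic coordinate descent."""
    n, d = len(X), len(X[0])
    w = [0.0] * d
    resid = list(y)  # residual y - Xw; equals y because w starts at zero
    col_sq = [sum(X[i][j] ** 2 for i in range(n)) / n for j in range(d)]
    for _ in range(n_iter):
        for j in range(d):
            if col_sq[j] == 0.0:
                continue
            # Correlation of feature j with the residual that excludes feature j.
            rho = sum(X[i][j] * (resid[i] + X[i][j] * w[j]) for i in range(n)) / n
            new_wj = soft_threshold(rho, lam) / col_sq[j]
            delta = new_wj - w[j]
            if delta != 0.0:
                for i in range(n):
                    resid[i] -= X[i][j] * delta
                w[j] = new_wj
    return w


def select_top_k(w, k):
    """Indices of the k weights with the largest magnitude."""
    order = sorted(range(len(w)), key=lambda j: -abs(w[j]))
    return set(order[:k])


def jaccard(a, b):
    """Overlap between two feature sets (1.0 means identical)."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def make_clean_data(n=200, d=10, seed=0):
    """Synthetic data (assumption): only features 0, 1, 2 carry signal."""
    rng = random.Random(seed)
    true_w = [3.0, -2.0, 1.5] + [0.0] * (d - 3)
    X = [[rng.gauss(0, 1) for _ in range(d)] for _ in range(n)]
    y = [sum(wj * xj for wj, xj in zip(true_w, row)) + 0.1 * rng.gauss(0, 1)
         for row in X]
    return X, y


def add_naive_poison(X, y, n_poison, feature=7, scale=5.0, label=20.0):
    """Hypothetical heuristic: append copies of one outlier that ties an
    irrelevant feature to a large target. NOT the paper's attack."""
    point = [0.0] * len(X[0])
    point[feature] = scale
    return X + [list(point) for _ in range(n_poison)], y + [label] * n_poison


if __name__ == "__main__":
    X, y = make_clean_data()
    clean = select_top_k(lasso_cd(X, y, lam=0.1), k=3)
    Xp, yp = add_naive_poison(X, y, n_poison=10)  # 10 of 210 points
    poisoned = select_top_k(lasso_cd(Xp, yp, lam=0.1), k=3)
    print("clean:", sorted(clean))
    print("poisoned:", sorted(poisoned))
    print("Jaccard overlap:", jaccard(clean, poisoned))
```

A lower Jaccard overlap means the injected points displaced more of the originally selected features. The numbers this toy setup produces say nothing about the paper's malware-detection results; it only illustrates the kind of before-and-after comparison such an evaluation relies on.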

Abstract

Learning in adversarial settings is becoming an important task for application domains where attackers may inject malicious data into the training set to subvert normal operation of data-driven technologies. Feature selection has been widely used in machine learning for security applications to improve generalization and computational efficiency, although it is not clear whether its use may be beneficial or even counterproductive when training data are poisoned by intelligent attackers. In this work, we shed light on this issue by providing a framework to investigate the robustness of popular feature selection methods, including LASSO, ridge regression and the elastic net. Our results on malware detection show that feature selection methods can be significantly compromised under attack (we can reduce LASSO to almost random choices of feature sets by careful insertion of less than 5%…

Taxonomy

Topics: Network Security and Intrusion Detection · Adversarial Robustness in Machine Learning · Anomaly Detection Techniques and Applications