Assessing Vulnerabilities of Adversarial Learning Algorithm through Poisoning Attacks

Jingfeng Zhang; Bo Song; Bo Han; Lei Liu; Gang Niu; Masashi Sugiyama

arXiv:2305.00399 · cs.CR · May 2, 2023 · 1 citation

TL;DR

This paper investigates the vulnerability of adversarial training (AT) to poisoning attacks and shows that AT can be compromised through subtle manipulations of the training data, which raises concerns about its use in security-sensitive AI applications.

Contribution

The study designs and tests novel clean-label poisoning attacks against AT, including targeted and untargeted methods, demonstrating AT's susceptibility to such malicious data manipulations.
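As a rough illustration of what "clean-label" means here, the sketch below perturbs an input within a small per-feature budget while leaving its label untouched. It is not the paper's attack: the function names, the budget `eps = 8 / 255`, the `[0, 1]` input range, and the random perturbation (standing in for an optimized one) are all assumptions made for illustration only.

```python
import random


def clean_label_poison(x, y, delta, eps, lo=0.0, hi=1.0):
    """Return a poisoned copy of (x, y) under a clean-label, L-infinity budget.

    x     : list of floats, one flattened input with values in [lo, hi]
    y     : label, returned unchanged (the clean-label constraint)
    delta : proposed perturbation, same length as x
    eps   : hypothetical per-feature budget (maximum absolute change)
    """
    if len(x) != len(delta):
        raise ValueError("x and delta must have the same length")
    poisoned = []
    for xi, di in zip(x, delta):
        di = max(-eps, min(eps, di))                # project onto the eps-ball
        poisoned.append(max(lo, min(hi, xi + di)))  # stay in the valid input range
    return poisoned, y


def max_change(x, x_poisoned):
    """Largest per-feature change between the clean and poisoned input."""
    return max(abs(a - b) for a, b in zip(x, x_poisoned))


# Toy usage: a random direction stands in for an attacker-optimized perturbation.
rng = random.Random(0)
x = [rng.random() for _ in range(8)]
delta = [rng.uniform(-1.0, 1.0) for _ in range(8)]
eps = 8 / 255  # assumed budget, not taken from the paper
x_p, y_p = clean_label_poison(x, 3, delta, eps)
```

A real attack would choose `delta` by optimizing against the training procedure, and a targeted or untargeted objective would differ only in what that optimization aims for; the projection and the unchanged label are the parts that make the poison clean-label.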

Findings

01. AT can be effectively poisoned with minimal data modifications
02. Clean-label attacks can control model behavior on specific data points
03. Attacks can degrade overall model performance, even against robust training

Abstract

Adversarial training (AT) is a robust learning algorithm that can defend against adversarial attacks in the inference phase and mitigate the side effects of corrupted data in the training phase. As such, it has become an indispensable component of many artificial intelligence (AI) systems. However, in high-stakes AI applications, it is crucial to understand AT's vulnerabilities to ensure reliable deployment. In this paper, we investigate AT's susceptibility to poisoning attacks, a type of malicious attack that manipulates training data to compromise the performance of the trained model. Previous work has focused on poisoning attacks against standard training, but little research has been done on their effectiveness against AT. To fill this gap, we design and test effective poisoning attacks against AT. Specifically, we investigate and design clean-label poisoning attacks, allowing…

Code & Models

Repositories

zjfheart/poison-adv-training (PyTorch, official)

Taxonomy

Topics: Adversarial Robustness in Machine Learning