Effective Targeted Attacks for Adversarial Self-Supervised Learning

Minseon Kim; Hyeonjeong Ha; Sooel Son; Sung Ju Hwang

arXiv:2210.10482 · cs.LG · October 27, 2023 · 1 citation

TL;DR

This paper introduces a targeted adversarial attack method for self-supervised learning that improves model robustness by selecting and perturbing instances towards similar, confusing targets, especially benefiting non-contrastive SSL frameworks.

Contribution

We propose a positive mining algorithm for targeted adversarial attacks that enhances robustness in self-supervised learning, addressing limitations of untargeted attacks.

Findings

01. Significant robustness improvements in non-contrastive SSL frameworks.

02. Moderate but consistent robustness gains in contrastive SSL frameworks.

03. Effective adversarial examples generated by targeting similar, confusing instances.

Abstract

Recently, unsupervised adversarial training (AT) has been highlighted as a means of achieving robustness in models without any label information. Previous studies in unsupervised AT have mostly focused on implementing self-supervised learning (SSL) frameworks, which maximize the instance-wise classification loss to generate adversarial examples. However, we observe that simply maximizing the self-supervised training loss with an untargeted adversarial attack often generates ineffective adversaries that may not help improve the robustness of the trained model, especially for non-contrastive SSL frameworks without negative examples. To tackle this problem, we propose a novel positive mining scheme for targeted adversarial attacks that generates effective adversaries for adversarial SSL frameworks. Specifically, we introduce an algorithm that selects the most confusing yet similar target…
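The idea of a targeted attack can be illustrated with a toy sketch. The abstract is truncated before it states the actual selection criterion, so everything below is an assumption made for illustration: the encoder is a hypothetical linear map W, the target is simply the most cosine-similar other instance in the batch, and the attack is a projected sign-gradient descent that pulls the perturbed embedding toward the target embedding inside an L-infinity ball of radius eps. The function names (encode, select_target, targeted_pgd) are hypothetical and do not come from the paper.

```python
import math

def encode(W, x):
    # Toy linear encoder z = W x (a stand-in for an SSL backbone; assumption).
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]

def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv + 1e-12)

def l2(u, v):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def select_target(embeddings, i):
    # Hypothetical selection rule: the most similar *other* instance.
    # The paper's real criterion is not given in the text above.
    others = [j for j in range(len(embeddings)) if j != i]
    return max(others, key=lambda j: cosine(embeddings[i], embeddings[j]))

def sign(g):
    return (g > 0) - (g < 0)

def targeted_pgd(W, x, z_target, eps=0.1, alpha=0.02, steps=10):
    # Minimize ||W(x + delta) - z_target||^2 subject to |delta_d| <= eps.
    delta = [0.0] * len(x)
    for _ in range(steps):
        z = encode(W, [xi + di for xi, di in zip(x, delta)])
        resid = [a - b for a, b in zip(z, z_target)]
        # Analytic gradient for the linear encoder: 2 * W^T (z - z_target).
        grad = [2 * sum(W[k][d] * resid[k] for k in range(len(W)))
                for d in range(len(x))]
        # Signed descent step, then projection back onto the eps-ball.
        delta = [max(-eps, min(eps, di - alpha * sign(g)))
                 for di, g in zip(delta, grad)]
    return [xi + di for xi, di in zip(x, delta)]

# Usage: perturb the first instance toward its most similar neighbour.
W = [[1.0, 0.0], [0.0, 1.0]]
batch = [[1.0, 0.0], [0.9, 0.3], [-1.0, 0.2]]
emb = [encode(W, x) for x in batch]
target = select_target(emb, 0)
adv = targeted_pgd(W, batch[0], emb[target])
```

In contrast to an untargeted attack, which only pushes the input away from its own positive view, the perturbation here has a definite direction: toward a specific other instance. With a real network the analytic gradient would be replaced by automatic differentiation.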

Videos

Effective Targeted Attacks for Adversarial Self-Supervised Learning · SlidesLive

Taxonomy

Topics: Adversarial Robustness in Machine Learning · Anomaly Detection Techniques and Applications · Machine Learning and Data Classification