Robust Transferable Feature Extractors: Learning to Defend Pre-Trained Networks Against White Box Adversaries

Alexander Cann; Ian Colbert; Ihab Amer

arXiv:2209.06931·cs.LG·September 16, 2022

TL;DR

This paper introduces a robust transferable feature extractor (RTFE) that enhances adversarial robustness of pre-trained models against white-box attacks, demonstrating transferability and one-shot robustness across models and datasets.

Contribution

The paper proposes a novel RTFE method that transfers adversarial defenses to independently trained models, improving robustness against white-box adversaries.

Findings

01 RTFE provides adversarial robustness to multiple pre-trained classifiers.

02 RTFE achieves one-shot robustness across different datasets.

03 The method is effective against adaptive white-box adversaries.

Abstract

The widespread adoption of deep neural networks in computer vision applications has generated significant interest in adversarial robustness. Existing research has shown that maliciously perturbed inputs specifically tailored for a given model (i.e., adversarial examples) can be successfully transferred to another independently trained model to induce prediction errors. Moreover, this property of adversarial examples has been attributed to features derived from predictive patterns in the data distribution. Thus, we are motivated to investigate the following question: Can adversarial defenses, like adversarial examples, be successfully transferred to other independently trained models? To this end, we propose a deep learning-based pre-processing mechanism, which we refer to as a robust transferable feature extractor (RTFE). After examining theoretical motivation and implications,…
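
To make the setup in the abstract concrete, the sketch below places a pre-processing map in front of a frozen classifier and lets a white-box adversary differentiate through both. Everything in it is a hypothetical stand-in chosen for illustration: the tiny linear classifier (`W`, `B`), the linear map `A` playing the role of the extractor, and FGSM as the attack. None of it is the paper's architecture, training procedure, or evaluation protocol.

```python
import math


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def matvec(m, v):
    return [dot(row, v) for row in m]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def sign(t):
    return (t > 0) - (t < 0)


# Hypothetical frozen "pre-trained" binary classifier: p(y=1 | x) = sigmoid(W.x + B).
W = [2.0, -1.0, 0.5]
B = 0.1

# Hypothetical pre-processing map g(x) = A x, standing in for a learned extractor.
A = [[0.5, 0.0, 0.0],
     [0.0, 0.5, 0.0],
     [0.0, 0.0, 0.5]]


def classify(x):
    """Undefended pipeline: classifier applied directly to the input."""
    return sigmoid(dot(W, x) + B)


def defended(x):
    """Defended pipeline: classifier applied to the pre-processed input."""
    return classify(matvec(A, x))


def loss_grad_wrt_input(x, y, use_extractor):
    """Gradient of binary cross-entropy w.r.t. x, using full knowledge of W, B, A."""
    p = defended(x) if use_extractor else classify(x)
    if use_extractor:
        # logit = W . (A x) + B, so d(logit)/dx = A^T W
        dlogit_dx = [sum(A[i][j] * W[i] for i in range(len(W)))
                     for j in range(len(x))]
    else:
        dlogit_dx = W
    # d(BCE)/d(logit) = p - y
    return [(p - y) * g for g in dlogit_dx]


def fgsm(x, y, eps, use_extractor):
    """One signed-gradient step of size eps that increases the loss."""
    grad = loss_grad_wrt_input(x, y, use_extractor)
    return [xi + eps * sign(gi) for xi, gi in zip(x, grad)]


if __name__ == "__main__":
    x, y = [1.0, 0.5, 0.2], 1
    x_adv = fgsm(x, y, eps=0.3, use_extractor=True)
    print("clean confidence:    ", defended(x))
    print("attacked confidence: ", defended(x_adv))
```

The point of the sketch is only where the extractor sits in the gradient path: the adversary computes its perturbation through `A` as well as through the classifier. The scaling map used here is not robust and should not be read as evidence for or against the paper's claims.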

Taxonomy

Topics: Adversarial Robustness in Machine Learning