Defeating Misclassification Attacks Against Transfer Learning

Bang Wu; Shuo Wang; Xingliang Yuan; Cong Wang; Carsten Rudolph; Xiangwen Yang

arXiv:1908.11230·cs.LG·February 10, 2022

Open Access

TL;DR

This paper introduces a novel defense mechanism against misclassification attacks in transfer learning by using activation-based network pruning and ensemble strategies, significantly improving robustness with minimal accuracy loss.

Contribution

It presents a new distillation-based differentiator and a two-phase ensemble defense to effectively mitigate advanced misclassification attacks in transfer learning systems.

Findings

01

Student models with 5 differentiators resist over 90% of adversarial inputs

02

Defense incurs less than 10% accuracy loss on recognition tasks

03

Outperforms previous defense methods in robustness and efficiency

Abstract

Transfer learning is a prevalent technique for efficiently generating new models (Student models) based on the knowledge transferred from a pre-trained model (Teacher model). However, Teacher models are often publicly available for sharing and reuse, which inevitably exposes transfer learning systems to severe attacks. In this paper, we take a first step towards mitigating one of the most advanced misclassification attacks in transfer learning. We design a distilled differentiator via activation-based network pruning to enervate the attack transferability while retaining accuracy. We adopt an ensemble structure from variant differentiators to improve the defence robustness. To avoid the bloated ensemble size during inference, we propose a two-phase defence, in which inference from the Student model is first performed to narrow down the candidate…
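
As a rough illustration of what activation-based pruning can look like, the sketch below ranks the neurons of one dense layer by their mean absolute activation on clean inputs and zeroes out the weakest ones. This is a minimal sketch under stated assumptions, not the authors' implementation: the ranking criterion, the function names `activation_keep_mask` and `prune_dense_layer`, and the `prune_ratio` parameter are all hypothetical and are not taken from the paper.

```python
import numpy as np


def activation_keep_mask(activations, prune_ratio):
    """Return a boolean mask that is False for neurons to prune (hypothetical helper)."""
    # activations: (n_samples, n_neurons) outputs of one layer on clean inputs.
    mean_act = np.abs(activations).mean(axis=0)
    n_prune = int(prune_ratio * mean_act.size)
    # Assumption: neurons with the smallest mean activation are pruned first.
    weakest = np.argsort(mean_act, kind="stable")[:n_prune]
    mask = np.ones(mean_act.size, dtype=bool)
    mask[weakest] = False
    return mask


def prune_dense_layer(weights, bias, mask):
    """Zero the weight columns and bias entries of pruned neurons."""
    # weights: (n_inputs, n_neurons), bias: (n_neurons,)
    return weights * mask, bias * mask


# Toy data: 2 samples, 4 neurons; neurons 0 and 2 are the least active.
acts = np.array([[0.0, 2.0, 0.1, 1.0],
                 [0.0, 3.0, 0.3, 1.0]])
mask = activation_keep_mask(acts, prune_ratio=0.5)
w_pruned, b_pruned = prune_dense_layer(np.ones((3, 4)), np.ones(4), mask)
```

In this toy case the mask keeps neurons 1 and 3 only. Varying `prune_ratio` or the pruned layer is one plausible way to obtain differing pruned models for an ensemble, though the abstract does not specify how its variant differentiators are produced.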

Taxonomy

Topics: Adversarial Robustness in Machine Learning · Domain Adaptation and Few-Shot Learning · Viral Infections and Outbreaks Research

Methods: Pruning