Learning to Disentangle Robust and Vulnerable Features for Adversarial Detection

Byunggill Joe; Sung Ju Hwang; Insik Shin

arXiv:1909.04311·cs.LG·September 11, 2019

TL;DR

This paper uses variational autoencoders to disentangle robust and vulnerable latent features in neural networks, which improves the detection of adversarial inputs under both blackbox and whitebox attacks.

Contribution

The paper proposes a minimax game framework that separates robust features from vulnerable ones, which improves adversarial detection and helps explain why adversarial inputs succeed.

Findings

01. Effective detection of adversarial inputs on multiple datasets

02. Robust features resist adversarial perturbations

03. Vulnerable features are key to understanding adversarial success

Abstract

Although deep neural networks have shown promising performance on various tasks, even achieving human-level performance on some, they have been shown to be susceptible to incorrect predictions even under imperceptibly small perturbations to an input. Many previous works have proposed to defend against such adversarial attacks, either by robust inference or by detection of adversarial inputs. Yet most of them cannot effectively defend against whitebox attacks, where an adversary has knowledge of the model and the defense. More importantly, they do not provide a convincing reason why the generated adversarial inputs successfully fool the target models. To address these shortcomings of the existing approaches, we hypothesize that the adversarial inputs are tied to latent features that are susceptible to adversarial perturbation, which we call vulnerable features. Then based…
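To make the phenomenon in the abstract's first sentence concrete, here is a minimal sketch of how a small, bounded perturbation can flip a classifier's prediction. It is not the paper's method: the linear model, the weights, the input, and the helper names `predict` and `sign_gradient_attack` are all hypothetical, chosen only to show a one-step sign-gradient perturbation on a toy example.

```python
import numpy as np

def predict(w, b, x):
    """Return class 1 if the linear score w.x + b is positive, else class 0."""
    return int(np.dot(w, x) + b > 0)

def sign_gradient_attack(w, x, y, eps):
    """One-step sign-gradient perturbation against a linear classifier.

    For a linear score s = w.x + b, the gradient of s with respect to x is w.
    Moving every coordinate by eps against the current class y therefore
    shifts the score by eps * ||w||_1 while keeping the change in each
    coordinate bounded by eps.
    """
    direction = -1.0 if y == 1 else 1.0
    return x + direction * eps * np.sign(w)

# Hypothetical toy model: 100 features, every weight is +0.1 or -0.1.
rng = np.random.default_rng(0)
w = 0.1 * rng.choice([-1.0, 1.0], size=100)
b = 0.0

# Hypothetical input aligned with the weights, so it is classified as 1.
x = 0.02 * np.sign(w)
y = predict(w, b, x)

# Perturb each coordinate by at most eps = 0.05.
x_adv = sign_gradient_attack(w, x, y, eps=0.05)
y_adv = predict(w, b, x_adv)

print("clean prediction:", y)
print("adversarial prediction:", y_adv)
print("max per-coordinate change:", np.max(np.abs(x_adv - x)))
```

Because the per-coordinate changes all push the score in the same direction, their effect accumulates across the 100 features, which is why a small bound on each coordinate is enough to change the predicted class in this toy setting.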

Taxonomy

Topics: Adversarial Robustness in Machine Learning · Anomaly Detection Techniques and Applications · Forensic and Genetic Research