Decoder-free Robustness Disentanglement without (Additional) Supervision

Yifei Wang; Dan Peng; Furui Liu; Zhenguo Li; Zhitang Chen; Jiansheng Yang

arXiv:2007.01356·stat.ML·July 6, 2020

TL;DR

This paper introduces Adversarial Asymmetric Training (AAT), an adversarial training method that disentangles robust and non-robust features without extra supervision on robustness, so that accuracy can be preserved by combining the two representations.

Contribution

The proposed Adversarial Asymmetric Training (AAT) method separates robust and non-robust representations without additional supervision on robustness, achieving better disentanglement than previous work while preserving accuracy.

Findings

1. Successfully preserves accuracy by combining two representations
2. Achieves better disentanglement than previous methods
3. No additional supervision needed for robustness separation

Abstract

Adversarial Training (AT) is proposed to alleviate the adversarial vulnerability of machine learning models by extracting only robust features from the input. This, however, inevitably leads to a severe reduction in accuracy, because it discards features that are non-robust yet useful. This motivates us to preserve both robust and non-robust features and to separate them with disentangled representation learning. Our proposed Adversarial Asymmetric Training (AAT) algorithm can reliably disentangle robust and non-robust representations without additional supervision on robustness. Empirical results show that our method not only preserves accuracy by combining the two representations, but also achieves much better disentanglement than previous work.
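
For readers unfamiliar with the Adversarial Training baseline mentioned in the abstract, the following is a minimal sketch of the general idea only. It is not the AAT algorithm, whose details are not given on this page. It assumes a logistic-regression model and a one-step, sign-of-gradient (FGSM-style) attack under an L-infinity budget `eps`; every function name and parameter here is hypothetical and chosen for illustration.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def predict(w, b, x):
    """Probability of class 1 under a logistic-regression model (assumed model)."""
    return sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)

def loss(w, b, x, y):
    """Binary cross-entropy for a single example with label y in {0, 1}."""
    p = predict(w, b, x)
    return -(y * math.log(p) + (1 - y) * math.log(1 - p))

def sign(v):
    return (v > 0) - (v < 0)

def fgsm(w, b, x, y, eps):
    """One-step attack: shift each input coordinate by eps in the
    direction that increases the loss (sign of the input gradient)."""
    p = predict(w, b, x)
    grad_x = [(p - y) * wi for wi in w]  # d(loss)/d(x) for logistic regression
    return [xi + eps * sign(gi) for xi, gi in zip(x, grad_x)]

def adversarial_training_step(w, b, x, y, eps, lr):
    """Craft an adversarial input, then take one gradient step on the
    loss evaluated at that input instead of the clean one."""
    x_adv = fgsm(w, b, x, y, eps)
    p = predict(w, b, x_adv)
    w_new = [wi - lr * (p - y) * xi for wi, xi in zip(w, x_adv)]
    b_new = b - lr * (p - y)
    return w_new, b_new, x_adv

# Illustrative values only (not taken from the paper).
w, b = [1.0, -2.0], 0.0
x, y = [0.5, 0.5], 1
w2, b2, x_adv = adversarial_training_step(w, b, x, y, eps=0.1, lr=0.1)
print("clean loss:", loss(w, b, x, y))
print("adversarial loss before step:", loss(w, b, x_adv, y))
print("adversarial loss after step:", loss(w2, b2, x_adv, y))
```

Training on perturbed inputs in this way pushes the model to rely on features that survive the perturbation, which is the accuracy trade-off the abstract describes.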

Taxonomy

Topics: Adversarial Robustness in Machine Learning · Anomaly Detection Techniques and Applications · Digital Media Forensic Detection