# Understanding Adversarial Behavior of DNNs by Disentangling Non-Robust and Robust Components in Performance Metric

**Authors:** Yujun Shi, Benben Liao, Guangyong Chen, Yun Liu, Ming-Ming Cheng, Jiashi Feng

arXiv: 1906.02494 · 2019-06-07

## TL;DR

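This paper introduces a metric for the generalization performance of DNNs that separates into a robust component and an information-theoretic non-robust component. Experiments indicate that current models rely heavily on optimizing the non-robust component, and that adversarial training improves robustness by keeping models from using it to distinguish categories.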

## Contribution

The work proposes a metric characterizing DNN generalization performance that can be disentangled into an information-theoretic non-robust component, responsible for adversarial behavior, and a robust component, clarifying how standard generalization relates to adversarial vulnerability.

## Key findings

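- Current DNNs rely heavily on optimizing the non-robust component to achieve decent performance.
- State-of-the-art adversarial training robustifies DNNs by preventing them from using the non-robust component to distinguish samples from different categories.
- Based on these findings, the authors point out a possible direction for achieving decent standard performance and adversarial robustness simultaneously.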

## Abstract

The vulnerability to slight input perturbations is a worrying yet intriguing property of deep neural networks (DNNs). Despite many previous works studying the reason behind such adversarial behavior, the relationship between the generalization performance and adversarial behavior of DNNs is still little understood. In this work, we reveal such relation by introducing a metric characterizing the generalization performance of a DNN. The metric can be disentangled into an information-theoretic non-robust component, responsible for adversarial behavior, and a robust component. Then, we show by experiments that current DNNs rely heavily on optimizing the non-robust component in achieving decent performance. We also demonstrate that current state-of-the-art adversarial training algorithms indeed try to robustify the DNNs by preventing them from using the non-robust component to distinguish samples from different categories. Also, based on our findings, we take a step forward and point out the possible direction for achieving decent standard performance and adversarial robustness simultaneously. We believe that our theory could further inspire the community to make more interesting discoveries about the relationship between standard generalization and adversarial generalization of deep learning models.

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/1906.02494/full.md

## Figures

13 figures with captions in the complete paper: https://tomesphere.com/paper/1906.02494/full.md

## References

33 references — full list in the complete paper: https://tomesphere.com/paper/1906.02494/full.md

---
Source: https://tomesphere.com/paper/1906.02494