Towards Explaining Adversarial Examples Phenomenon in Artificial Neural Networks

Ramin Barati; Reza Safabakhsh; Mohammad Rahmati

arXiv:2107.10599·cs.LG·May 27, 2022

TL;DR

This paper investigates why adversarial examples exist in neural networks and proposes a theoretical framework based on convergence concepts to explain the phenomenon, unifying and extending previous explanations.

Contribution

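The paper introduces a convergence-based theoretical explanation for adversarial examples and adversarial training, relates the objectives of evasion attacks and adversarial training to concepts already defined in learning theory, and demonstrates through experiments that the framework applies to real-world problems.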

Findings

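1. Pointwise convergence in ANNs can explain the existence of adversarial examples and the observations made about adversarial training.

2. The framework extends and unifies several existing explanations from the literature.

3. Experiments show that the framework is useful for studying the phenomenon and applicable to real-world problems.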

Abstract

In this paper, we study the existence of adversarial examples and adversarial training from the standpoint of convergence, and provide evidence that pointwise convergence in ANNs can explain these observations. The main contribution of our proposal is that it relates the objectives of evasion attacks and adversarial training to concepts already defined in learning theory. We also extend and unify some of the other proposals in the literature and provide alternative explanations for the observations made in those proposals. Through different experiments, we demonstrate that the framework is valuable in the study of the phenomenon and is applicable to real-world problems.
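
The abstract does not spell out what pointwise convergence guarantees, so the following is background only: a minimal sketch of the standard textbook distinction between pointwise and uniform convergence, not a construction taken from the paper. The toy sequence f_n(x) = x**n on [0, 1) and the helper names `f_n`, `pointwise_error`, and `worst_case_error` are assumptions chosen for illustration. At any fixed input the sequence approaches its limit 0, yet for every n there remain inputs close to 1 where the error is still close to 1.

```python
import numpy as np


def f_n(x, n):
    """Hypothetical toy sequence f_n(x) = x**n (illustrative, not from the paper)."""
    return np.power(x, n)


def pointwise_error(x, n):
    # Distance to the pointwise limit (the zero function on [0, 1))
    # at one fixed input x.
    return float(abs(f_n(x, n)))


def worst_case_error(n, grid_size=100_001):
    # Approximates the supremum of |f_n(x) - 0| over [0, 1) on a dense grid
    # that excludes the endpoint 1. The true supremum is 1 for every n; the
    # grid only approximates it, and the approximation degrades once n is
    # comparable to grid_size.
    xs = np.linspace(0.0, 1.0, grid_size, endpoint=False)
    return float(np.max(np.abs(f_n(xs, n))))


if __name__ == "__main__":
    for n in (10, 100, 1000):
        print(n, pointwise_error(0.9, n), worst_case_error(n))
```

Under these assumptions, the error at the fixed point 0.9 shrinks rapidly as n grows, while the worst-case error over the interval stays near 1. This is the sense in which convergence at each individual input does not by itself bound the error at every input simultaneously.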
