Explainable Adversarial Attacks in Deep Neural Networks Using Activation Profiles

Gabriel D. Cantareira; Rodrigo F. Mello; Fernando V. Paulovich

arXiv:2103.10229·cs.LG·March 19, 2021

Open Access

TL;DR

This paper introduces a visual framework to analyze how neural networks perceive adversarial examples differently from regular data, aiding in identifying vulnerabilities and guiding improvements in model robustness.

Contribution

The paper presents a novel visualization method for investigating how neural networks respond to adversarial attacks. It reveals how a model's perception of adversarial inputs differs from its perception of regular inputs, which supports vulnerability analysis.

Findings

01. Visual framework reveals perception differences between regular and adversarial data

02. Helps identify exploited vulnerabilities in neural network models

03. Guides improvements in training and architecture for robustness

Abstract

As neural networks become the tool of choice to solve an increasing variety of problems in our society, adversarial attacks become critical. The possibility of generating data instances deliberately designed to fool a network's analysis can have disastrous consequences. Recent work has shown that commonly used methods for model training often result in fragile abstract representations that are particularly vulnerable to such attacks. This paper presents a visual framework to investigate neural network models subjected to adversarial examples, revealing how models' perception of the adversarial data differs from regular data instances and their relationships with class perception. Through different use cases, we show how observing these elements can quickly pinpoint exploited areas in a model, allowing further study of vulnerable features in input data and serving as a guide to improving…
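
To make the idea of comparing a model's perception of regular and adversarial data more concrete, the following is a minimal sketch and not the paper's framework. It assumes a toy two-layer ReLU network with random weights, uses the fast gradient sign method (FGSM) as a stand-in attack, and measures the per-layer L2 distance between the activations of a regular input and its perturbed counterpart. All names (`forward`, `fgsm`, `profile_shift`), the network shape, and the value of `eps` are illustrative assumptions.

```python
import numpy as np

def forward(x, params):
    """Return the activations of each layer: hidden ReLU output, then logits."""
    W1, b1, W2, b2 = params
    h = np.maximum(0.0, W1 @ x + b1)
    logits = W2 @ h + b2
    return [h, logits]

def softmax(z):
    e = np.exp(z - z.max())  # subtract the max for numerical stability
    return e / e.sum()

def input_gradient(x, y, params):
    """Gradient of the cross-entropy loss for label y with respect to x."""
    W1, b1, W2, b2 = params
    h, logits = forward(x, params)
    d_logits = softmax(logits)
    d_logits[y] -= 1.0                 # softmax probabilities minus one-hot label
    d_pre = (W2.T @ d_logits) * (h > 0)  # backpropagate through the ReLU
    return W1.T @ d_pre

def fgsm(x, y, params, eps):
    """FGSM perturbation (assumed attack; the paper does not prescribe it here)."""
    return x + eps * np.sign(input_gradient(x, y, params))

def profile_shift(x, x_adv, params):
    """Per-layer L2 distance between regular and adversarial activations."""
    pairs = zip(forward(x, params), forward(x_adv, params))
    return [float(np.linalg.norm(a - b)) for a, b in pairs]

# Hypothetical toy setup: 4 inputs, 8 hidden units, 3 classes.
rng = np.random.default_rng(0)
params = (rng.normal(size=(8, 4)), np.zeros(8),
          rng.normal(size=(3, 8)), np.zeros(3))
x = rng.normal(size=4)
y = int(np.argmax(forward(x, params)[-1]))  # treat the prediction as the label
x_adv = fgsm(x, y, params, eps=0.25)
shifts = profile_shift(x, x_adv, params)
print("per-layer activation shift:", shifts)
```

Layers with a large shift are candidates for closer inspection. A visual framework such as the one described in the abstract would present this kind of information across many instances and classes rather than for a single input.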

Taxonomy

Topics: Adversarial Robustness in Machine Learning · Explainable Artificial Intelligence (XAI) · Anomaly Detection Techniques and Applications