An Adversarial Approach for Explaining the Predictions of Deep Neural Networks

Arash Rahnama; Andrew Tseng

arXiv:2005.10284·cs.LG·September 29, 2020

TL;DR

This paper introduces a fast adversarial method for explaining deep neural network predictions: it ranks input features by importance according to how an adversarial attack on the network behaves, and it is evaluated across several models and datasets.

Contribution

The paper presents a fast, consistent, and easy-to-implement adversarial approach for explaining DNN predictions, demonstrating its generality and effectiveness across multiple tasks and datasets.

Findings

01. The approach provides consistent explanations for different inputs.

02. It is faster and more interpretable than existing methods.

03. Experimental results validate its effectiveness across various models.

Abstract

Machine learning models have been successfully applied to a wide range of applications, including computer vision, natural language processing, and speech recognition. A successful implementation of these models, however, usually relies on deep neural networks (DNNs), which are treated as opaque black-box systems due to their incomprehensible complexity and intricate internal mechanisms. In this work, we present a novel algorithm for explaining the predictions of a DNN using adversarial machine learning. Our approach identifies the relative importance of input features in relation to the predictions based on the behavior of an adversarial attack on the DNN. Our algorithm has the advantage of being fast, consistent, and easy to implement and interpret. We present our detailed analysis that demonstrates how the behavior of an adversarial attack, given a DNN and a task, stays consistent for…
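The general idea in the abstract, scoring input features by how an adversarial attack perturbs them, can be illustrated with a small sketch. This is not the authors' algorithm, which the excerpt above does not specify. It assumes a toy logistic-regression "model" with fixed weights, a simple normalized gradient-ascent attack that runs until the predicted label flips, and an importance score equal to each feature's share of the total absolute perturbation. The function name `adversarial_importance` and all parameters are hypothetical.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def adversarial_importance(w, b, x, step=0.01, max_iter=1000):
    """Illustrative only: attack a logistic model and score features
    by how much the attack had to change them.

    Assumptions (not from the paper): the model is sigmoid(w @ x + b),
    the attack is normalized gradient ascent on the cross-entropy loss,
    and importance is the normalized absolute perturbation.
    """
    w = np.asarray(w, dtype=float)
    x = np.asarray(x, dtype=float)

    # Treat the model's own prediction as the label to attack.
    label = 1.0 if sigmoid(w @ x + b) >= 0.5 else 0.0

    x_adv = x.copy()
    for _ in range(max_iter):
        p = sigmoid(w @ x_adv + b)
        if (p >= 0.5) != (label == 1.0):
            break  # prediction flipped, attack succeeded
        # Gradient of the cross-entropy loss with respect to the input.
        grad = (p - label) * w
        # Take a fixed-length step that increases the loss.
        x_adv = x_adv + step * grad / (np.linalg.norm(grad) + 1e-12)

    delta = np.abs(x_adv - x)
    total = delta.sum()
    return delta / total if total > 0 else delta


if __name__ == "__main__":
    scores = adversarial_importance(w=[3.0, 0.5, 0.0], b=0.0, x=[1.0, 1.0, 1.0])
    print(scores)  # the first feature receives the largest share
```

For a linear model the attack always moves along the weight vector, so the scores simply track the magnitude of each weight. With a deep network the gradient depends on the input, which is where an attack-based attribution would become informative.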
