Interpreting Neural Networks Using Flip Points

Roozbeh Yousefzadeh; Dianne P. O'Leary

arXiv:1903.08789 · cs.LG · March 22, 2019 · 6 cites

TL;DR

This paper introduces flip points, inputs that lie on the boundary between two output classes, as a technique for interpreting neural networks. Flip points help explain individual model decisions, measure confidence, identify influential features, and analyze bias, and the approach applies across different models.

Contribution

The paper presents a method that uses flip points to interpret neural networks, measure confidence in their outputs, identify influential features, and analyze bias. The approach is model-agnostic.

Findings

01 Flip points help interpret neural network outputs.

02 Distance to flip points measures confidence.

03 Directions from training data to flip points explain feature importance.
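
Findings 02 and 03 can be illustrated with a small sketch. It assumes the closest flip point for an input has already been computed by some other means; the feature names, the input, and the flip point below are hypothetical values chosen for illustration, and the helper names `flip_distance` and `feature_changes` do not come from the paper.

```python
import math

def flip_distance(x, x_flip):
    """Euclidean distance from an input to a flip point.

    A small distance suggests the input sits near the decision
    boundary, so the output deserves less confidence.
    """
    return math.dist(x, x_flip)

def feature_changes(x, x_flip, names):
    """Per-feature change from the input to the flip point,
    sorted so the largest change (by magnitude) comes first."""
    deltas = [(name, f - xi) for name, xi, f in zip(names, x, x_flip)]
    return sorted(deltas, key=lambda item: abs(item[1]), reverse=True)

# Hypothetical data: three features already scaled to comparable ranges.
names = ["income", "age", "debt"]
x = [0.5, 0.25, 0.75]
x_flip = [0.5, 0.5, 0.25]  # assumed to be the closest flip point for x

distance = flip_distance(x, x_flip)
ranking = feature_changes(x, x_flip, names)

print(f"distance to flip point: {distance:.4f}")
for name, delta in ranking:
    print(f"{name}: {delta:+.2f}")
```

In this toy case the largest component of the direction is on `debt`, so that feature would need to change most for the output to flip. The ranking is only meaningful when features share a comparable scale, which is an assumption of this sketch.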

Abstract

Neural networks have been criticized for their lack of easy interpretation, which undermines confidence in their use for important applications. Here, we introduce a novel technique, interpreting a trained neural network by investigating its flip points. A flip point is any point that lies on the boundary between two output classes: e.g. for a neural network with a binary yes/no output, a flip point is any input that generates equal scores for "yes" and "no". The flip point closest to a given input is of particular importance, and this point is the solution to a well-posed optimization problem. This paper gives an overview of the uses of flip points and how they are computed. Through results on standard datasets, we demonstrate how flip points can be used to provide detailed interpretation of the output produced by a neural network. Moreover, for a given input, flip points enable us to…
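
To make the definition concrete, here is a minimal sketch of finding the closest flip point for a model simple enough to solve by hand. It assumes a linear two-class model with raw scores z_k = W[k]·x + b[k], which stands in for a neural network; for a real network the closest flip point would come from a numerical optimizer, not from this closed form. The weights, the input, and the function names are hypothetical.

```python
import math

def scores(x, W, b):
    """Raw class scores z_k = W[k]·x + b[k] for a linear two-class model."""
    return [sum(w * xi for w, xi in zip(row, x)) + bk for row, bk in zip(W, b)]

def closest_flip_point(x, W, b):
    """Closest point to x (Euclidean) at which both scores are equal.

    Equal scores means d·x + c = 0 with d = W[0] - W[1] and
    c = b[0] - b[1], so the answer is the orthogonal projection
    of x onto that hyperplane.
    """
    d = [w0 - w1 for w0, w1 in zip(W[0], W[1])]
    c = b[0] - b[1]
    norm_sq = sum(di * di for di in d)
    if norm_sq == 0:
        raise ValueError("score difference does not depend on x")
    gap = sum(di * xi for di, xi in zip(d, x)) + c
    return [xi - (gap / norm_sq) * di for xi, di in zip(x, d)]

# Hypothetical weights and input for a "yes"/"no" classifier.
W = [[2.0, 1.0],   # weights for the "yes" score
     [0.0, -1.0]]  # weights for the "no" score
b = [0.5, 0.0]
x = [1.0, 1.0]

x_flip = closest_flip_point(x, W, b)
z_yes, z_no = scores(x_flip, W, b)
distance = math.dist(x, x_flip)

print("flip point:", x_flip)
print("scores at flip point:", z_yes, z_no)
print("distance from input:", distance)
```

At the returned point the "yes" and "no" scores coincide, which matches the abstract's definition of a flip point for a binary output.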

Taxonomy

Topics: Adversarial Robustness in Machine Learning · Explainable Artificial Intelligence (XAI) · Machine Learning and Data Classification

Methods: Softmax