New Perspective of Interpretability of Deep Neural Networks

Masanari Kimura; Masayuki Tanaka

arXiv:1909.07156 · cs.LG · September 17, 2019 · 1 cites

TL;DR

This paper proposes a new definition of interpretability for deep neural networks based on human predictability, emphasizing how easily humans can anticipate inference changes when the model is perturbed.

Contribution

It introduces the concept of human predictability as a measurable aspect of DNN interpretability, providing a clearer framework for understanding and improving model transparency.

Findings

01 Defined human predictability as ease of predicting inference changes

02 Presented an example of a highly human-predictable DNN

03 Discussed implications for interpretability research

Abstract

Deep neural networks (DNNs) are known as black-box models: their internal state is difficult to interpret. Improving the interpretability of DNNs is an active research topic. At present, however, the definition of interpretability for DNNs is vague, and what counts as a highly explanatory model remains controversial. To address this issue, we define the human predictability of a model as one part of the interpretability of DNNs. Human predictability, as proposed in this paper, is the ease of predicting the change in inference when the DNN model is perturbed. In addition, we introduce one example of a highly human-predictable DNN. We argue that our definition will help research on the interpretability of DNNs across various types of applications.
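
The abstract does not give a formula for human predictability, so the following is only an illustrative sketch of the idea, not the authors' method. It assumes a toy one-hidden-layer network, perturbs a single output weight, and compares the actual change in inference with a guessed change. The names `forward`, `perturb_output_weight`, `inference_change`, and `prediction_error`, and the use of mean absolute error as a stand-in for "ease of prediction", are all hypothetical choices made for this example.

```python
import math


def forward(weights, x):
    """Toy network (an assumption): tanh hidden units, linear output."""
    w_hidden, w_out = weights
    hidden = [math.tanh(sum(w * xi for w, xi in zip(row, x))) for row in w_hidden]
    return sum(w * h for w, h in zip(w_out, hidden))


def perturb_output_weight(weights, index, delta):
    """Return a copy of the weights with one output weight shifted by delta."""
    w_hidden, w_out = weights
    new_out = list(w_out)
    new_out[index] += delta
    return ([list(row) for row in w_hidden], new_out)


def inference_change(weights, x, index, delta):
    """Actual change in the network output caused by the perturbation."""
    perturbed = perturb_output_weight(weights, index, delta)
    return forward(perturbed, x) - forward(weights, x)


def prediction_error(weights, inputs, index, delta, guess):
    """Mean absolute gap between a guessed and the actual inference change.

    `guess` is a hypothetical stand-in for a human's prediction: a function
    mapping an input to the change in output the person expects.
    A smaller value means the perturbation's effect was easier to anticipate.
    """
    gaps = [abs(guess(x) - inference_change(weights, x, index, delta)) for x in inputs]
    return sum(gaps) / len(gaps)


# Hypothetical example values, chosen only for illustration.
weights = ([[0.5, -0.25], [1.0, 0.0]], [2.0, -1.0])
inputs = [[1.0, 2.0], [0.0, 1.0]]

# A guess that knows hidden unit 1 computes tanh(x[0]) and scales it by delta.
informed_guess = lambda x: 0.5 * math.tanh(x[0])
# A guess that expects no change at all.
naive_guess = lambda x: 0.0

informed_error = prediction_error(weights, inputs, 1, 0.5, informed_guess)
naive_error = prediction_error(weights, inputs, 1, 0.5, naive_guess)
```

In this sketch the informed guess matches the actual change because the perturbed weight acts linearly on the output, while the naive guess does not. Whether such an error measure matches the definition in the paper is an assumption that the abstract alone cannot confirm.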

Taxonomy

Topics: Explainable Artificial Intelligence (XAI) · Adversarial Robustness in Machine Learning · Machine Learning and Data Classification

Methods: Interpretability