A Functional Information Perspective on Model Interpretation

Itai Gat; Nitay Calderon; Roi Reichart; Tamir Hazan

arXiv:2206.05700 · cs.LG · June 15, 2022 · 1 citation

TL;DR

This paper introduces a theoretical framework for model interpretability based on functional entropy and Fisher information, providing a principled way to quantify feature contributions in complex models.

Contribution

The paper proposes an interpretability method grounded in information theory that uses the log-Sobolev inequality to measure feature importance.
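
For reference, the Gaussian form of the log-Sobolev inequality can be written as below. This is the standard textbook statement rather than a formula copied from the paper, and the notation is assumed here: \mu = N(m, \Sigma) is a Gaussian measure over inputs z, and f is a smooth, non-negative function such as a class probability.

```latex
\mathrm{Ent}_{\mu}(f)
  \;=\; \int f \log f \, d\mu \;-\; \Big(\int f \, d\mu\Big) \log\Big(\int f \, d\mu\Big)
  \;\le\; \frac{1}{2} \int \frac{\nabla f(z)^{\top} \Sigma \, \nabla f(z)}{f(z)} \, d\mu(z)
```

When \Sigma is diagonal, the right-hand side splits into a sum of one term per input coordinate, which is one natural way to read off a per-feature contribution.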

Findings

01 Outperforms existing sampling-based interpretability methods

02 Effective across image, text, and audio data

03 Provides a theoretical basis for feature contribution measurement

Abstract

Contemporary predictive models are hard to interpret because their deep nets exploit numerous complex relations between input elements. This work proposes a theoretical framework for model interpretability that measures the contribution of relevant features to the functional entropy of the network with respect to the input. We rely on the log-Sobolev inequality, which bounds the functional entropy by the functional Fisher information with respect to the covariance of the data. This provides a principled way to measure how much information a subset of features contributes to the decision function. Through extensive experiments, we show that our method surpasses existing sampling-based interpretability methods on various data signals such as image, text, and audio.
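
The bound described in the abstract can be checked numerically on a toy model. The sketch below is an illustration only, not the authors' implementation (their code lives in the repository listed on this page). It assumes a Gaussian measure centred on an input x with diagonal covariance, and a hypothetical logistic model with hand-picked weights `w` and bias `b`. It estimates the functional entropy by Monte Carlo and splits the functional Fisher information bound into one term per feature.

```python
import numpy as np


def sigmoid(s):
    return 1.0 / (1.0 + np.exp(-s))


def functional_information(w, b, x, sigma, n_samples=200_000, seed=0):
    """Monte Carlo estimate of the functional entropy of a logistic model
    and of the per-feature terms of the log-Sobolev (Fisher) bound.

    Assumption: inputs are drawn from N(x, diag(sigma**2)).
    """
    rng = np.random.default_rng(seed)
    z = x + sigma * rng.standard_normal((n_samples, x.size))

    f = sigmoid(z @ w + b)                        # model output, in (0, 1)
    grad = (f * (1.0 - f))[:, None] * w[None, :]  # df/dz for each sample

    # Functional entropy: E[f log f] - E[f] log E[f]
    mean_f = f.mean()
    entropy = np.mean(f * np.log(f)) - mean_f * np.log(mean_f)

    # Fisher bound, one term per feature: 0.5 * E[sigma_i^2 (df/dz_i)^2 / f]
    per_feature = 0.5 * np.mean(sigma**2 * grad**2 / f[:, None], axis=0)
    return entropy, per_feature


# Hypothetical toy model: the third feature is ignored by the model.
w = np.array([2.0, 1.0, 0.0])
b = 0.0
x = np.zeros(3)
sigma = np.ones(3)

entropy, per_feature = functional_information(w, b, x, sigma)
print("functional entropy:", entropy)
print("per-feature Fisher terms:", per_feature)
print("Fisher bound (sum):", per_feature.sum())
```

In this toy setting the per-feature terms scale with the squared weight, so the ignored third feature receives a contribution of exactly zero, and the sum of the terms upper-bounds the estimated entropy.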

Code & Models

Repositories

nitaytech/functionalexplanation · PyTorch · Official

Taxonomy

Topics: Explainable Artificial Intelligence (XAI) · Adversarial Robustness in Machine Learning · Neural Networks and Applications