PredDiff: Explanations and Interactions from Conditional Expectations

Stefan Blücher; Johanna Vielhaben; Nils Strodthoff

arXiv:2102.13519·cs.LG·July 12, 2023

TL;DR

PredDiff is a theoretically grounded, model-agnostic method for local feature attribution and interaction detection, extending existing approaches with a new measure for feature interactions, applicable to both classification and regression tasks.

Contribution

The paper introduces a new, well-founded measure for interaction effects between arbitrary feature subsets within PredDiff. It also clarifies the method's connection to Shapley values and stresses that classification and regression each require their own treatment.

Findings

01. PredDiff provides reliable, numerically inexpensive attributions.

02. The new interaction measure captures complex feature interactions.

03. PredDiff's connection to Shapley values clarifies its theoretical foundation.

Abstract

PredDiff is a model-agnostic, local attribution method that is firmly rooted in probability theory. Its simple intuition is to measure prediction changes while marginalizing features. In this work, we clarify properties of PredDiff and its close connection to Shapley values. We stress important differences between classification and regression, which require a specific treatment within both formalisms. We extend PredDiff by introducing a new, well-founded measure for interaction effects between arbitrary feature subsets. The study of interaction effects represents an inevitable step towards a comprehensive understanding of black-box models and is particularly important for science applications. Equipped with our novel interaction measure, PredDiff is a promising model-agnostic approach for obtaining reliable, numerically inexpensive and theoretically sound attributions.
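
The core idea in the abstract, measuring how a prediction changes when features are marginalized out, can be illustrated with a minimal sketch for the regression case. This is not the authors' implementation. The function names `preddiff_relevance` and `preddiff_interaction` are hypothetical, the imputer simply resamples the chosen features from a background dataset (a marginal stand-in for the conditional distribution, which is an assumption), and the interaction score is taken as the joint relevance minus the two individual relevances, which may differ in detail from the definition in the paper.

```python
import numpy as np


def preddiff_relevance(model, x, background, features):
    """Prediction at x minus the mean prediction with `features` marginalized.

    Assumption: the features are marginalized by replacing them with the
    values found in every background row (marginal, not conditional, imputation).
    """
    features = list(features)
    # One copy of x per background row, with the chosen columns overwritten.
    x_imputed = np.tile(np.asarray(x, dtype=float), (len(background), 1))
    x_imputed[:, features] = background[:, features]
    return model(np.asarray(x, dtype=float)[None, :])[0] - model(x_imputed).mean()


def preddiff_interaction(model, x, background, a, b):
    """Joint relevance of a and b minus their individual relevances (assumed form)."""
    joint = preddiff_relevance(model, x, background, list(a) + list(b))
    return (joint
            - preddiff_relevance(model, x, background, a)
            - preddiff_relevance(model, x, background, b))


# Toy regression model: additive in feature 0, multiplicative in features 1 and 2.
def toy_model(X):
    return X[:, 0] + 2.0 * X[:, 1] * X[:, 2]


# Background data: all sign combinations, so every feature has mean zero.
background = np.array([[s0, s1, s2]
                       for s0 in (-1.0, 1.0)
                       for s1 in (-1.0, 1.0)
                       for s2 in (-1.0, 1.0)])
x = np.array([1.0, 1.0, 1.0])

relevance_0 = preddiff_relevance(toy_model, x, background, [0])
interaction_01 = preddiff_interaction(toy_model, x, background, [0], [1])
interaction_12 = preddiff_interaction(toy_model, x, background, [1], [2])
```

In this toy setup, features 0 and 1 enter the model additively, so their interaction score vanishes, while features 1 and 2 are multiplied together and receive a non-zero score. For classification, the abstract notes that a different treatment is required, so this sketch should not be applied to class probabilities as is.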

Taxonomy

Topics: Explainable Artificial Intelligence (XAI) · Generative Adversarial Networks and Image Synthesis · Adversarial Robustness in Machine Learning