Explaining Neural Networks without Access to Training Data
Sascha Marton, Stefan Lüdtke, Christian Bartelt, Andrej Tschalzev, Heiner Stuckenschmidt

TL;DR
This paper extends the $\mathcal{I}$-Net framework to explain neural networks without access to their training data. It uses standard and soft decision trees as surrogate models, generates the $\mathcal{I}$-Net's training data from more realistic distributions, and outperforms traditional interpretability methods when training data is unavailable.
Contribution
The paper extends $\mathcal{I}$-Nets to standard and soft decision trees as surrogate models, enabling data-free interpretability in real-world scenarios.
Findings
Outperforms traditional interpretability methods when training data is unavailable.
Successfully applies to standard and soft decision trees as surrogate models.
Makes $\mathcal{I}$-Nets applicable to real-world tasks by generating their training data from more realistic distributions.
Abstract
We consider generating explanations for neural networks in cases where the network's training data is not accessible, for instance due to privacy or safety issues. Recently, $\mathcal{I}$-Nets have been proposed as a sample-free approach to post-hoc, global model interpretability that does not require access to training data. They formulate interpretation as a machine learning task that maps network representations (parameters) to a representation of an interpretable function. In this paper, we extend the $\mathcal{I}$-Net framework to the cases of standard and soft decision trees as surrogate models. We propose a suitable decision tree representation and design of the corresponding $\mathcal{I}$-Net output layers. Furthermore, we make $\mathcal{I}$-Nets applicable to real-world tasks by considering more realistic distributions when generating the $\mathcal{I}$-Net's training data. We…
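The abstract describes mapping a network's parameters to a representation of a decision tree. The following is a minimal sketch of what such a representation could look like, not the encoding used in the paper: the layout of the flat vector (per-node feature logits, then thresholds, then leaf logits), the heap-style node indexing, and the function names `flatten_params`, `decode_tree`, and `predict_tree` are all assumptions made for illustration. The $\mathcal{I}$-Net itself, which would produce the tree vector from the flattened parameters, is not implemented here.

```python
import numpy as np


def flatten_params(arrays):
    """Concatenate weight matrices and bias vectors into one flat vector.

    This stands in for the "network representation" an I-Net takes as input.
    """
    return np.concatenate([np.asarray(a).ravel() for a in arrays])


def decode_tree(vec, depth, n_features):
    """Split a flat vector into a complete binary decision tree.

    Assumed (hypothetical) layout of `vec`:
      [feature logits per inner node | thresholds | leaf logits]
    """
    n_inner = 2 ** depth - 1
    n_leaves = 2 ** depth
    a = n_inner * n_features
    expected = a + n_inner + n_leaves
    if vec.shape[0] != expected:
        raise ValueError(f"expected {expected} values, got {vec.shape[0]}")
    feature_logits = vec[:a].reshape(n_inner, n_features)
    thresholds = vec[a:a + n_inner]
    leaf_logits = vec[a + n_inner:]
    # Hard feature choice per inner node: the feature with the largest logit.
    features = feature_logits.argmax(axis=1)
    # Leaf value: probability of the positive class (binary classification).
    leaf_probs = 1.0 / (1.0 + np.exp(-leaf_logits))
    return features, thresholds, leaf_probs


def predict_tree(x, features, thresholds, leaf_probs):
    """Route one sample through the tree (children of node i: 2i+1, 2i+2)."""
    n_inner = len(features)
    node = 0
    while node < n_inner:
        go_right = x[features[node]] > thresholds[node]
        node = 2 * node + 1 + int(go_right)
    return leaf_probs[node - n_inner]


# Usage: a depth-1 tree over two features, written directly as a flat vector.
tree_vec = np.array([0.0, 1.0,      # feature logits for the root (picks feature 1)
                     0.5,           # threshold at the root
                     -10.0, 10.0])  # leaf logits (left, right)
features, thresholds, leaf_probs = decode_tree(tree_vec, depth=1, n_features=2)
p_right = predict_tree(np.array([0.0, 0.9]), features, thresholds, leaf_probs)
p_left = predict_tree(np.array([1.0, 0.1]), features, thresholds, leaf_probs)
```

A fixed-length vector like this is convenient because a network's output layer has a fixed size; a soft decision tree variant would replace the hard `argmax` and threshold comparison with differentiable counterparts.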
Taxonomy
Topics: Explainable Artificial Intelligence (XAI) · Adversarial Robustness in Machine Learning · Bayesian Modeling and Causal Inference
