xRAI: Explainable Representations through AI
Christian Bartelt, Sascha Marton, Heiner Stuckenschmidt

TL;DR
xRAI introduces a method to extract symbolic mathematical representations from trained neural networks by training interpretation networks that translate network weights into explicit functions, aiding interpretability.
Contribution
This work proposes a novel approach using interpretation networks to derive explicit symbolic functions from neural network weights, enhancing understanding of neural decision processes.
Findings
Interpretation networks can be trained efficiently on synthetic data.
The approach successfully extracts symbolic representations for Boolean functions and polynomials.
Results show promising quality in the generated symbolic functions.
Abstract
We present xRAI, an approach for extracting symbolic representations of the mathematical function a neural network was supposed to learn from the trained network. The approach is based on the idea of training a so-called interpretation network that receives the weights and biases of the trained network as input and outputs a numerical representation of the function the network was supposed to learn, which can be directly translated into a symbolic representation. We show that interpretation nets for different classes of functions can be trained offline on synthetic data, using Boolean functions and low-order polynomials as examples. We show that the training is rather efficient and the quality of the results is promising. Our work aims to contribute to the problem of better understanding neural decision making by making the target function explicit.
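The data flow described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the 1-4-1 network shape, the helper names `flatten_params` and `coefficients_to_symbolic`, and the random linear map standing in for a trained interpretation network are all assumptions made for illustration. The sketch only shows how weights and biases become a single input vector and how a numerical output (here, polynomial coefficients) can be rendered as a symbolic expression.

```python
import numpy as np

def flatten_params(weights, biases):
    """Concatenate all weight matrices and bias vectors into one input vector."""
    return np.concatenate([p.ravel() for pair in zip(weights, biases) for p in pair])

def coefficients_to_symbolic(coeffs, var="x"):
    """Render a coefficient vector [c0, c1, c2, ...] as a polynomial string."""
    terms = []
    for power, c in enumerate(coeffs):
        if abs(c) < 1e-9:  # skip (near-)zero coefficients
            continue
        if power == 0:
            terms.append(f"{c:.2f}")
        elif power == 1:
            terms.append(f"{c:.2f}*{var}")
        else:
            terms.append(f"{c:.2f}*{var}^{power}")
    return " + ".join(terms) if terms else "0"

# Hypothetical trained network with layer sizes 1-4-1 (random values as placeholders).
rng = np.random.default_rng(0)
weights = [rng.normal(size=(1, 4)), rng.normal(size=(4, 1))]
biases = [rng.normal(size=4), rng.normal(size=1)]
theta = flatten_params(weights, biases)  # 4 + 4 + 4 + 1 = 13 parameters

# Stand-in for a trained interpretation network: an untrained random linear map
# from the 13 parameters to the 3 coefficients of a degree-2 polynomial.
W_interp = rng.normal(size=(theta.size, 3))
predicted_coeffs = theta @ W_interp

print(coefficients_to_symbolic(predicted_coeffs))
```

In an actual setup, the linear map would be replaced by an interpretation network trained on many (network parameters, known target function) pairs generated synthetically, as the abstract describes.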
Taxonomy
Topics: Explainable Artificial Intelligence (XAI) · Machine Learning and Data Classification · Bayesian Modeling and Causal Inference
