Influence-Driven Explanations for Bayesian Network Classifiers
Antonio Rago, Emanuele Albini, Pietro Baroni, Francesca Toni

TL;DR
This paper introduces influence-driven explanations (IDXs) for Bayesian network classifiers, enhancing transparency by systematically incorporating causal influences and logical properties to generate context-aware, interpretable explanations.
Contribution
It presents a novel influence-based explanation framework for Bayesian classifiers that guarantees interpretability and can be tailored to user and context requirements.
Findings
IDXs outperform heuristic methods in explanation quality
IDXs can explain various Bayesian classifier types
The approach integrates and improves upon existing explanation methods
Abstract
One of the most pressing issues in AI in recent years has been the need to address the lack of explainability of many of its models. We focus on explanations for discrete Bayesian network classifiers (BCs), targeting greater transparency of their inner workings by including intermediate variables in explanations, rather than just the input and output variables as is standard practice. The proposed influence-driven explanations (IDXs) for BCs are systematically generated using the causal relationships between variables within the BC, called influences, which are then categorised by logical requirements, called relation properties, according to their behaviour. These relation properties both provide guarantees beyond heuristic explanation methods and allow the information underpinning an explanation to be tailored to a particular context's and user's requirements, e.g., IDXs may be…
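As a rough illustration of the idea in the abstract (influences between variables, labelled by a property of how they behave), here is a minimal sketch on a toy naive Bayes classifier. Everything in it is an assumption made for illustration: the variable names, the probability tables, and the "support"/"attack" rule, which labels an influence by whether changing the feature's value lowers or raises the posterior of the predicted class. It is not the paper's definition of a relation property, and it covers only input-to-class influences, not the intermediate variables the paper targets.

```python
# Hypothetical toy naive Bayes classifier: class C in {0, 1}, two binary features.
PRIOR = {0: 0.5, 1: 0.5}
# LIKELIHOOD[feature][class][value] = P(feature = value | class); numbers are made up.
LIKELIHOOD = {
    "fever": {0: {0: 0.8, 1: 0.2}, 1: {0: 0.3, 1: 0.7}},
    "cough": {0: {0: 0.6, 1: 0.4}, 1: {0: 0.5, 1: 0.5}},
}
DOMAIN = {"fever": (0, 1), "cough": (0, 1)}


def posterior(assignment):
    """Return P(C | assignment) as a dict mapping class value to probability."""
    joint = {}
    for c, p in PRIOR.items():
        for feat, val in assignment.items():
            p *= LIKELIHOOD[feat][c][val]
        joint[c] = p
    total = sum(joint.values())
    return {c: p / total for c, p in joint.items()}


def classify(assignment):
    """Return the most probable class for the given feature assignment."""
    post = posterior(assignment)
    return max(post, key=post.get)


def label_influences(assignment):
    """Label each feature-to-class influence as 'support', 'attack', or None.

    Illustrative rule (an assumption, not the paper's definition):
    - 'support' if every alternative value of the feature lowers the
      posterior of the predicted class;
    - 'attack' if every alternative value raises it;
    - None otherwise.
    """
    predicted = classify(assignment)
    base = posterior(assignment)[predicted]
    labels = {}
    for feat in assignment:
        alternatives = [v for v in DOMAIN[feat] if v != assignment[feat]]
        changed = [posterior({**assignment, feat: v})[predicted] for v in alternatives]
        if all(p < base for p in changed):
            labels[feat] = "support"
        elif all(p > base for p in changed):
            labels[feat] = "attack"
        else:
            labels[feat] = None
    return predicted, labels


predicted, labels = label_influences({"fever": 1, "cough": 0})
print(predicted, labels)
```

In this toy setup the explanation is just the set of labelled influences. A different labelling rule would select different influences, which is one way to read the abstract's point that the information behind an explanation can be tailored to context.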
Taxonomy
Topics: Bayesian Modeling and Causal Inference · Explainable Artificial Intelligence (XAI) · Adversarial Robustness in Machine Learning
