SpArX: Sparse Argumentative Explanations for Neural Networks [Technical Report]
Hamed Ayoobi, Nico Potyka, Francesca Toni

TL;DR
SpArX sparsifies multi-layer perceptrons (MLPs) and translates them into quantitative argumentation frameworks (QAFs), producing explanations that are faithful to the network's actual mechanics rather than only to its input-output behaviour.
Contribution
The paper presents SpArX, a new approach that combines sparsification and argumentation frameworks to generate more faithful and insightful explanations of neural networks' decision-making.
Findings
SpArX provides more faithful explanations than existing methods.
It produces global and/or local explanations of an MLP's decision process.
Experimental results demonstrate improved insight into neural network reasoning.
Abstract
Neural networks (NNs) have various applications in AI, but explaining their decisions remains challenging. Existing approaches often focus on explaining how changing individual inputs affects NNs' outputs. However, an explanation that is consistent with the input-output behaviour of an NN is not necessarily faithful to the actual mechanics thereof. In this paper, we exploit relationships between multi-layer perceptrons (MLPs) and quantitative argumentation frameworks (QAFs) to create argumentative explanations for the mechanics of MLPs. Our SpArX method first sparsifies the MLP while maintaining as much of the original structure as possible. It then translates the sparse MLP into an equivalent QAF to shed light on the underlying decision process of the MLP, producing global and/or local explanations. We demonstrate experimentally that SpArX can give more faithful explanations than…
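The two steps named in the abstract (sparsify, then translate to a QAF) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the abstract does not say how neurons are merged or how edges are labelled, so the aggregation rule (average incoming weights and biases, sum outgoing weights) and the sign convention (positive weight as support, negative weight as attack) are assumptions, as are the names `merge_hidden_neurons` and `layer_to_edges` and the hand-picked clusters.

```python
from typing import List, Tuple

# (source argument, target argument, relation, weight)
Edge = Tuple[str, str, str, float]


def merge_hidden_neurons(w_in, bias, w_out, clusters):
    """Merge hidden neurons cluster by cluster (assumed aggregation rule).

    w_in[i][h]  : weight from input i to hidden neuron h
    bias[h]     : bias of hidden neuron h
    w_out[h][o] : weight from hidden neuron h to output o
    clusters    : list of lists of hidden-neuron indices (hypothetical grouping)
    """
    n_in, n_out = len(w_in), len(w_out[0])
    # Assumption: incoming weights and biases are averaged within a cluster.
    new_w_in = [[sum(w_in[i][h] for h in c) / len(c) for c in clusters]
                for i in range(n_in)]
    new_bias = [sum(bias[h] for h in c) / len(c) for c in clusters]
    # Assumption: outgoing weights are summed within a cluster.
    new_w_out = [[sum(w_out[h][o] for h in c) for o in range(n_out)]
                 for c in clusters]
    return new_w_in, new_bias, new_w_out


def layer_to_edges(weights, src_prefix, dst_prefix) -> List[Edge]:
    """Turn one weight matrix into labelled QAF edges.

    Assumption: a positive weight becomes a support edge and a negative
    weight an attack edge; zero weights produce no edge.
    """
    edges = []
    for s, row in enumerate(weights):
        for d, w in enumerate(row):
            if w == 0:
                continue
            relation = "support" if w > 0 else "attack"
            edges.append((f"{src_prefix}{s}", f"{dst_prefix}{d}", relation, w))
    return edges


# Toy MLP: 2 inputs, 3 hidden neurons, 1 output (made-up numbers).
w_in = [[1.0, 3.0, -2.0],
        [0.5, 1.5, 4.0]]
bias = [0.0, 2.0, -1.0]
w_out = [[1.0], [2.0], [-0.5]]
clusters = [[0, 1], [2]]  # hypothetical: neurons 0 and 1 behave alike

s_w_in, s_bias, s_w_out = merge_hidden_neurons(w_in, bias, w_out, clusters)
qaf_edges = layer_to_edges(s_w_in, "x", "h") + layer_to_edges(s_w_out, "h", "y")

for edge in qaf_edges:
    print(edge)
```

In this toy case the three hidden neurons collapse into two cluster neurons, and the resulting graph has six labelled edges instead of nine, which is the sense in which the sparse network is easier to read as an argumentation framework.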
Taxonomy
Topics: Explainable Artificial Intelligence (XAI) · Machine Learning in Materials Science · Topic Modeling
