Formally Explaining Neural Networks within Reactive Systems
Shahaf Bassan, Guy Amir, Davide Corsi, Idan Refaeli, Guy Katz

TL;DR
This paper introduces a formal, verification-based explainability method for neural networks within reactive systems, providing reliable, minimal explanations that outperform existing heuristic approaches in efficiency and accuracy.
Contribution
It presents a formal XAI technique tailored to multi-step reactive systems, using DNN verification to produce trustworthy explanations efficiently.
Findings
Outperforms state-of-the-art methods in explanation efficiency
Produces more reliable, formal explanations
Effective on automated navigation benchmarks
Abstract
Deep neural networks (DNNs) are increasingly being used as controllers in reactive systems. However, DNNs are highly opaque, which renders it difficult to explain and justify their actions. To mitigate this issue, there has been a surge of interest in explainable AI (XAI) techniques, capable of pinpointing the input features that caused the DNN to act as it did. Existing XAI techniques typically face two limitations: (i) they are heuristic, and do not provide formal guarantees that the explanations are correct; and (ii) they often apply to "one-shot" systems, where the DNN is invoked independently of past invocations, as opposed to reactive systems. Here, we begin bridging this gap, and propose a formal DNN-verification-based XAI technique for reasoning about multi-step, reactive systems. We suggest methods for efficiently calculating succinct explanations, by exploiting the system's…
Taxonomy
Topics: Explainable Artificial Intelligence (XAI) · Adversarial Robustness in Machine Learning
