Scaling Symbolic Methods using Gradients for Neural Model Explanation
Subham Sekhar Sahoo, Subhashini Venugopalan, Li Li, Rishabh Singh, Patrick Riley

TL;DR
This paper introduces a scalable method combining gradient-based and symbolic techniques to generate interpretable saliency maps for neural networks, improving explanation quality on large datasets.
Contribution
It proposes a novel approach that integrates gradient information with SMT solvers to produce sparse, high-quality explanations for neural network predictions.
Findings
Generates sparser saliency maps with higher scores.
Scales to large neural networks.
Evaluated on MNIST, ImageNet, and Beer Reviews.
Abstract
Symbolic techniques based on Satisfiability Modulo Theory (SMT) solvers have been proposed for analyzing and verifying neural network properties, but their usage has been fairly limited owing to their poor scalability with larger networks. In this work, we propose a technique for combining gradient-based methods with symbolic techniques to scale such analyses and demonstrate its application for model explanation. In particular, we apply this technique to identify minimal regions in an input that are most relevant for a neural network's prediction. Our approach uses gradient information (based on Integrated Gradients) to focus on a subset of neurons in the first layer, which allows our technique to scale to large networks. The corresponding SMT constraints encode the minimal input mask discovery problem such that after masking the input, the activations of the selected neurons are still…
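As a rough illustration of the pipeline the abstract describes, the sketch below scores first-layer neurons with a Riemann-sum approximation of Integrated Gradients, keeps the top-scoring ones, and then searches for the smallest input mask that preserves their activations. Everything concrete here is an assumption made for illustration: the toy weights `W1` and `W2`, the zero baseline, the helper names, and the retention threshold `GAMMA` (the abstract is cut off before it states the actual activation condition). Exhaustive search also stands in for the SMT solver, so this is not the authors' implementation.

```python
from itertools import combinations

# Hypothetical toy network: 4 inputs -> 3 ReLU neurons -> scalar score.
W1 = [[1.0, 0.0, 2.0, 0.0],
      [0.0, 1.0, 0.0, 0.0],
      [0.5, 0.5, 0.0, 1.0]]
W2 = [2.0, 0.1, 1.0]
GAMMA = 0.6  # assumed fraction of each selected activation that must survive masking


def first_layer(x):
    """ReLU activations of the first layer."""
    return [max(0.0, sum(w * xi for w, xi in zip(row, x))) for row in W1]


def head(h):
    """Rest of the network, mapping first-layer activations to the score."""
    return sum(w * hi for w, hi in zip(W2, h))


def integrated_gradients(f, point, baseline, steps=50, eps=1e-4):
    """Midpoint Riemann-sum IG, using central-difference gradients of f."""
    attributions = [0.0] * len(point)
    for s in range(steps):
        alpha = (s + 0.5) / steps
        z = [b + alpha * (p - b) for p, b in zip(point, baseline)]
        for i in range(len(point)):
            up, down = z[:], z[:]
            up[i] += eps
            down[i] -= eps
            grad = (f(up) - f(down)) / (2 * eps)
            attributions[i] += grad * (point[i] - baseline[i]) / steps
    return attributions


def select_top_neurons(attributions, k):
    """Indices of the k neurons with the largest attribution, in index order."""
    ranked = sorted(range(len(attributions)), key=lambda i: -attributions[i])
    return sorted(ranked[:k])


def minimal_mask(x, neurons, gamma):
    """Smallest 0/1 input mask keeping each selected neuron's activation
    at or above gamma times its unmasked value (brute force, not SMT)."""
    target = first_layer(x)
    for size in range(len(x) + 1):
        for keep in combinations(range(len(x)), size):
            mask = [1 if i in keep else 0 for i in range(len(x))]
            acts = first_layer([xi * m for xi, m in zip(x, mask)])
            if all(acts[j] >= gamma * target[j] for j in neurons):
                return mask
    return [1] * len(x)


x = [1.0, 1.0, 1.0, 1.0]
hidden = first_layer(x)
attributions = integrated_gradients(head, hidden, [0.0] * len(hidden))
selected = select_top_neurons(attributions, k=2)
mask = minimal_mask(x, selected, GAMMA)
print("selected neurons:", selected)
print("minimal mask:", mask)
```

Restricting the constraints to a few selected neurons is what keeps the search small in this toy setting. The brute-force loop grows exponentially with the number of inputs, which is the kind of cost a solver-based formulation is meant to handle more gracefully.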
Taxonomy
Topics: Adversarial Robustness in Machine Learning · Explainable Artificial Intelligence (XAI) · Machine Learning in Materials Science
