ABLE: Using Adversarial Pairs to Construct Local Models for Explaining Model Predictions
Krishna Khadka, Sunny Shree, Pujan Budhathoki, Yu Lei, Raghu Kacker, D. Richard Kuhn

TL;DR
ABLE introduces a novel local explanation method that uses adversarial pairs to better approximate decision boundaries, improving stability and fidelity over existing techniques like LIME.
Contribution
The paper presents a new approach called ABLE that leverages adversarial pairs to construct more stable and accurate local explanations for complex models.
Findings
Achieves higher stability than LIME.
Provides higher local fidelity than existing techniques such as LIME.
Demonstrates effectiveness across multiple datasets and architectures.
Abstract
Machine learning models are increasingly used in critical applications but are mostly "black boxes" due to their lack of transparency. Local explanation approaches, such as LIME, address this issue by approximating the behavior of complex models near a test instance using simple, interpretable models. However, these approaches often suffer from instability and poor local fidelity. In this paper, we propose a novel approach called Adversarially Bracketed Local Explanation (ABLE) to address these limitations. Our approach first generates a set of neighborhood points near the test instance, x_test, by adding bounded Gaussian noise. For each neighborhood point D, we apply an adversarial attack to generate an adversarial point A with minimal perturbation that results in a different label than D. A second adversarial attack is then performed on A to generate a point A' that has the same label…
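The steps the abstract describes (bounded Gaussian sampling around x_test, an attack on each neighborhood point D to obtain A, then a second attack on A to obtain A') can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and not the authors' implementation: the two-feature toy classifier `predict`, the helper names `sample_neighborhood`, `flip_label`, and `adversarial_pairs`, the parameter values, and above all the attack itself, which is a simple random-direction search followed by bisection standing in for whatever adversarial attack the paper uses. Because the toy classifier is binary, a point whose label differs from A's necessarily has D's label again.

```python
import math
import random

def predict(x):
    """Hypothetical black-box classifier: label 1 if x[0] + x[1] > 1, else 0."""
    return 1 if x[0] + x[1] > 1.0 else 0

def sample_neighborhood(x_test, n, sigma, bound, rng):
    """Add Gaussian noise to x_test, clipping each coordinate's noise to [-bound, bound]."""
    points = []
    for _ in range(n):
        noise = [max(-bound, min(bound, rng.gauss(0.0, sigma))) for _ in x_test]
        points.append([xi + ni for xi, ni in zip(x_test, noise)])
    return points

def flip_label(start, rng, step=0.05, max_radius=5.0, tries=32, refine_iters=30):
    """Stand-in attack (assumption): find a nearby point whose label differs from start's.

    Tries random directions at growing radii, then bisects along the segment
    from start to the first label-flipping candidate to shrink the perturbation.
    """
    base = predict(start)
    radius = step
    while radius <= max_radius:
        for _ in range(tries):
            d = [rng.gauss(0.0, 1.0) for _ in start]
            norm = math.sqrt(sum(di * di for di in d)) or 1.0
            cand = [s + radius * di / norm for s, di in zip(start, d)]
            if predict(cand) != base:
                lo, hi = 0.0, 1.0  # fractions along start -> cand; hi always flips
                for _ in range(refine_iters):
                    mid = (lo + hi) / 2.0
                    p = [s + mid * (c - s) for s, c in zip(start, cand)]
                    if predict(p) != base:
                        hi = mid
                    else:
                        lo = mid
                return [s + hi * (c - s) for s, c in zip(start, cand)]
        radius += step
    return None  # no label change found within max_radius

def adversarial_pairs(x_test, n=20, sigma=0.1, bound=0.3, seed=0):
    """Return (A, A') pairs: A flips the label of D, A' flips the label of A."""
    rng = random.Random(seed)
    pairs = []
    for d in sample_neighborhood(x_test, n, sigma, bound, rng):
        a = flip_label(d, rng)
        if a is None:
            continue
        a_prime = flip_label(a, rng)
        if a_prime is None:
            continue
        pairs.append((a, a_prime))
    return pairs

pairs = adversarial_pairs([0.4, 0.4])
```

Each returned pair holds two points with different predicted labels, so the two points lie on opposite sides of the toy decision boundary. The sketch covers only the sampling and the two attacks described above.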
Taxonomy
Topics: Adversarial Robustness in Machine Learning · Explainable Artificial Intelligence (XAI) · Generative Adversarial Networks and Image Synthesis
