Informative Perturbation Selection for Uncertainty-Aware Post-hoc Explanations
Sumedha Chugh, Ranjitha Prasad, Nazreen Shah

TL;DR
EAGLE is a novel explanation method that adaptively selects perturbations using information theory to improve the reliability, stability, and confidence of post-hoc explanations for black-box models.
Contribution
It introduces an active learning framework for perturbation selection in post-hoc explanations, enhancing explanation quality and providing uncertainty estimates.
Findings
EAGLE achieves higher explanation reproducibility.
It improves neighborhood stability over existing methods.
Sample complexity scales as O(d log t).
Abstract
Trust and ethical concerns arising from the widespread deployment of opaque machine learning (ML) models motivate the need for reliable model explanations. Post-hoc model-agnostic explanation methods address this challenge by learning a surrogate model that approximates the behavior of the deployed black-box ML model in the locality of a sample of interest. In post-hoc scenarios, neither the underlying model parameters nor the training data are available; hence, this local neighborhood must be constructed by generating perturbed inputs in the neighborhood of the sample of interest, together with their corresponding model predictions. We propose Expected Active Gain for Local Explanations (EAGLE), a post-hoc model-agnostic explanation framework that formulates perturbation selection as an information-theoretic active learning problem. By adaptively sampling perturbations that maximize…
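To make the general idea concrete, here is a minimal sketch of a local surrogate fitted from adaptively chosen perturbations. It is not the authors' EAGLE algorithm, since the abstract above is truncated before it states the acquisition objective. The sketch assumes a Bayesian linear surrogate with Gaussian noise and picks, at each step, the candidate perturbation with the largest information gain about the surrogate weights, which for this model has the closed form 0.5 * log(1 + phi^T Sigma phi / noise_var). The names `black_box` and `explain`, and all default parameter values, are hypothetical and chosen only for illustration.

```python
import numpy as np


def black_box(X):
    # Hypothetical stand-in for the deployed model; only its predictions are used.
    return 1.0 / (1.0 + np.exp(-(2.0 * X[:, 0] - 1.0 * X[:, 1])))


def explain(x0, predict, n_queries=30, n_candidates=200, scale=0.3,
            noise_var=0.01, prior_var=10.0, seed=0):
    """Fit a Bayesian linear surrogate around x0 using greedily selected queries.

    Assumption: Gaussian prior and noise, so the posterior over weights is
    Gaussian and the information gain of a query has a closed form.
    """
    rng = np.random.default_rng(seed)
    d = x0.shape[0]
    # Posterior precision and linear term; the last weight is the intercept.
    precision = np.eye(d + 1) / prior_var
    b = np.zeros(d + 1)

    for _ in range(n_queries):
        cov = np.linalg.inv(precision)
        # Draw a fresh pool of candidate perturbations around the sample of interest.
        cand = x0 + scale * rng.standard_normal((n_candidates, d))
        feats = np.hstack([cand - x0, np.ones((n_candidates, 1))])
        # Predictive variance phi^T Sigma phi for every candidate.
        pred_var = np.einsum("ij,jk,ik->i", feats, cov, feats)
        # Information gain about the weights from observing each candidate's label.
        gain = 0.5 * np.log1p(pred_var / noise_var)
        best = int(np.argmax(gain))
        # Query the black box only at the selected perturbation.
        y = predict(cand[best:best + 1])[0]
        phi = feats[best]
        precision += np.outer(phi, phi) / noise_var
        b += phi * y / noise_var

    cov = np.linalg.inv(precision)
    mean = cov @ b
    # Feature attributions and their posterior standard deviations.
    return mean[:d], np.sqrt(np.diag(cov))[:d]


weights, std = explain(np.array([0.0, 0.0]), black_box)
print("attributions:", weights)
print("posterior std:", std)
```

The posterior standard deviations give a simple per-feature uncertainty estimate for the explanation. In this sketch the selection rule tends to favor perturbations far from the sample, so `scale` implicitly controls how local the explanation is.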
Taxonomy
Topics: Explainable Artificial Intelligence (XAI) · Adversarial Robustness in Machine Learning · Artificial Intelligence in Healthcare and Education
