Explaining text classifiers through progressive neighborhood approximation with realistic samples
Yi Cai, Arthur Zimek, Eirini Ntoutsi, Gerhard Wunder

TL;DR
This paper introduces a progressive neighborhood approximation method for local text classifier explanations, utilizing counterfactuals and realistic samples to improve interpretability and fidelity.
Contribution
It proposes a novel two-stage interpolation approach and a probability-based alternative for generating realistic neighborhoods in local explanations of text classifiers.
Findings
Both the two-stage interpolation approach and the probability-based alternative produce more realistic and interpretable explanations.
Experimental results show improved explanation quality over existing approaches.
The approaches are effective across various datasets and models.
Abstract
The importance of neighborhood construction in local explanation methods has already been highlighted in the literature, and several attempts have been made to improve neighborhood quality for high-dimensional data, such as text, by adopting generative models. Although the generators produce more realistic samples, the intuitive sampling approaches in the existing solutions leave the latent space underexplored. To overcome this problem, our work, focusing on local model-agnostic explanations for text classifiers, proposes a progressive approximation approach that refines the neighborhood of a to-be-explained decision with a careful two-stage interpolation using counterfactuals as landmarks. We explicitly specify the two properties that should be satisfied by generative models, the reconstruction ability and the locality-preserving property, to guide the selection of generators for…
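The idea of interpolating in a latent space with counterfactuals as landmarks can be illustrated with a minimal sketch. It is not the authors' algorithm: the function names (`interpolate`, `find_landmark`, `build_neighborhood`), the step counts, and the toy classifier `toy_predict` are all hypothetical, and the classifier here acts directly on latent vectors as a stand-in for decoding a latent point to text and then querying the black-box text classifier. The sketch assumes a coarse first stage that locates a point where the predicted label flips, and a finer second stage that samples between the instance and each such landmark.

```python
import numpy as np

def interpolate(z_a, z_b, steps):
    """Return `steps` evenly spaced points on the segment from z_a to z_b (inclusive)."""
    alphas = np.linspace(0.0, 1.0, steps)[:, None]
    return (1.0 - alphas) * z_a + alphas * z_b

def find_landmark(z, z_other, predict, steps=11):
    """Stage 1 (coarse, assumed): walk from z toward z_other and return the first
    point whose predicted label differs from the label of z, or None if none flips."""
    path = interpolate(z, z_other, steps)
    labels = predict(path)
    flipped = np.flatnonzero(labels != labels[0])
    return path[flipped[0]] if flipped.size else None

def build_neighborhood(z, landmarks, predict, steps=21):
    """Stage 2 (fine, assumed): sample densely between z and each landmark."""
    samples = np.vstack([interpolate(z, lm, steps) for lm in landmarks])
    return samples, predict(samples)

def toy_predict(points):
    """Hypothetical black box: class 1 when the first latent coordinate exceeds 0.5."""
    return (points[:, 0] > 0.5).astype(int)

# Hypothetical latent code of the instance to explain and two opposite-class points.
z = np.array([0.0, 0.0])
others = [np.array([1.0, 0.0]), np.array([1.0, 1.0])]

candidates = (find_landmark(z, o, toy_predict) for o in others)
landmarks = [lm for lm in candidates if lm is not None]
samples, labels = build_neighborhood(z, landmarks, toy_predict)
```

The labeled `samples` could then be used to fit an interpretable surrogate around the instance. With a real generator, each latent point would be decoded to text before classification, which is where the reconstruction and locality-preserving properties mentioned in the abstract would matter.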
Taxonomy
Topics: Topic Modeling · Explainable Artificial Intelligence (XAI) · Generative Adversarial Networks and Image Synthesis
Methods: Counterfactual Explanations
