NoMatterXAI: Generating "No Matter What" Alterfactual Examples for Explaining Black-Box Text Classification Models
Tuc Nguyen, James Michels, Hua Shen, Thai Le

TL;DR
This paper introduces NoMatterXAI, an algorithm for generating alterfactual explanations in text classification, which systematically tests model robustness against irrelevant feature changes to improve interpretability.
Contribution
It formulates alterfactual example generation as an optimization problem and proposes a novel method for creating high-fidelity, context-preserving alterfactuals in text classification.
Findings
Achieves up to 95% fidelity in alterfactual generation
Maintains over 90% context similarity across models and datasets
Human study confirms effectiveness of explanations
Abstract
In Explainable AI (XAI), counterfactual explanations (CEs) are a well-studied method to communicate feature relevance through contrastive reasoning of "what if" to explain AI models' predictions. However, they only focus on important (i.e., relevant) features and largely disregard less important (i.e., irrelevant) ones. Such irrelevant features can be crucial in many applications, especially when users need to ensure that an AI model's decisions are not affected or biased against specific attributes such as gender, race, religion, or political affiliation. To address this gap, the concept of alterfactual explanations (AEs) has been proposed. AEs explore an alternative reality of "no matter what", where irrelevant features are substituted with alternative features (e.g., "republicans" -> "democrats") within the same attribute (e.g., "politics") while maintaining a similar prediction…
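The "no matter what" idea in the abstract can be illustrated with a minimal sketch. This is not the NoMatterXAI algorithm itself; it is a brute-force toy under stated assumptions. The lexicon `ATTRIBUTE_LEXICON`, the stand-in model `toy_classifier`, and the helper `generate_alterfactuals` are all hypothetical names introduced here for illustration. The sketch swaps a word for other words in the same attribute group and keeps only the variants for which the predicted label stays the same.

```python
from typing import Callable, Dict, List

# Hypothetical attribute lexicon (an assumption, not from the paper):
# words grouped by the attribute they belong to.
ATTRIBUTE_LEXICON: Dict[str, List[str]] = {
    "politics": ["republicans", "democrats", "independents"],
}


def toy_classifier(text: str) -> str:
    """Stand-in black-box model: labels text using a single keyword."""
    return "negative" if "terrible" in text.lower().split() else "positive"


def generate_alterfactuals(
    text: str,
    predict: Callable[[str], str],
    lexicon: Dict[str, List[str]],
) -> List[str]:
    """Return variants of `text` in which one attribute word is replaced by
    another word from the same attribute and the prediction is unchanged."""
    tokens = text.split()
    original_label = predict(text)
    results: List[str] = []
    for i, token in enumerate(tokens):
        for words in lexicon.values():
            if token.lower() not in words:
                continue  # token does not belong to this attribute
            for alt in words:
                if alt == token.lower():
                    continue  # skip the word that is already there
                candidate = " ".join(tokens[:i] + [alt] + tokens[i + 1:])
                # Keep the candidate only if the model's label is preserved.
                if predict(candidate) == original_label:
                    results.append(candidate)
    return results


if __name__ == "__main__":
    sentence = "the republicans gave a terrible speech"
    for variant in generate_alterfactuals(sentence, toy_classifier, ATTRIBUTE_LEXICON):
        print(variant)
```

In this toy setting, a substitution that flips the label would be dropped from the output, which hints that the model's decision depends on that attribute. A real system would also need to check that the substituted text preserves the original context, which this sketch does not attempt.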
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Explainable Artificial Intelligence (XAI)
Methods: Autoencoders · Focus
