Optimal and efficient text counterfactuals using Graph Neural Networks
Dimitris Lymperopoulos, Maria Lymperaiou, Giorgos Filandrianos, Giorgos Stamou

TL;DR
This paper introduces a graph neural network-based framework for generating fast, contrastive, and minimal counterfactual explanations in NLP tasks, enhancing model interpretability.
Contribution
It presents a novel GNN-based method for efficient and semantically meaningful counterfactual generation in NLP, outperforming existing approaches in speed and quality.
Findings
Generated counterfactuals are contrastive, fluent, and minimal.
The framework is significantly faster than state-of-the-art methods.
Effective on sentiment and topic classification tasks.
Abstract
As NLP models become increasingly integral to decision-making processes, the need for explainability and interpretability has become paramount. In this work, we propose a framework that achieves the aforementioned by generating semantically edited inputs, known as counterfactual interventions, which change the model prediction, thus providing a form of counterfactual explanations for the model. We test our framework on two NLP tasks - binary sentiment classification and topic classification - and show that the generated edits are contrastive, fluent and minimal, while the whole process remains significantly faster than other state-of-the-art counterfactual editors.
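The criteria named in the abstract can be illustrated with a small sketch. It is not the authors' GNN-based editor or their evaluation code. It only shows, under stated assumptions, how a single edit might be checked for contrastiveness (the predicted label flips) and minimality (few tokens change). The keyword classifier `toy_sentiment`, the helper `evaluate_edit`, and the length-normalised minimality score are all hypothetical stand-ins chosen for illustration. Fluency is not measured here.

```python
from typing import Callable, Dict, List


def token_edit_distance(a: List[str], b: List[str]) -> int:
    """Levenshtein distance computed over tokens rather than characters."""
    prev = list(range(len(b) + 1))
    for i, tok_a in enumerate(a, start=1):
        curr = [i]
        for j, tok_b in enumerate(b, start=1):
            cost = 0 if tok_a == tok_b else 1
            curr.append(min(prev[j] + 1,          # delete tok_a
                            curr[j - 1] + 1,      # insert tok_b
                            prev[j - 1] + cost))  # substitute or keep
        prev = curr
    return prev[-1]


def toy_sentiment(text: str) -> str:
    """Hypothetical keyword classifier standing in for a real NLP model."""
    negative = {"bad", "boring", "awful"}
    tokens = text.lower().split()
    return "negative" if any(t in negative for t in tokens) else "positive"


def evaluate_edit(original: str, edited: str,
                  classify: Callable[[str], str]) -> Dict[str, object]:
    """Check one counterfactual edit for label flip and edit size."""
    orig_tokens = original.split()
    edit_tokens = edited.split()
    dist = token_edit_distance(orig_tokens, edit_tokens)
    return {
        # Contrastive: the classifier's prediction changes after the edit.
        "contrastive": classify(original) != classify(edited),
        # Minimality: raw token edits, and edits normalised by input length
        # (an assumed normalisation, lower means a smaller edit).
        "edit_distance": dist,
        "minimality": dist / max(len(orig_tokens), 1),
    }


result = evaluate_edit("the movie was boring",
                       "the movie was thrilling",
                       toy_sentiment)
print(result)
```

Here a single substituted token flips the toy classifier's prediction, so the edit counts as both contrastive and small under these assumed definitions.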
Taxonomy
Topics: Spam and Phishing Detection · Advanced Malware Detection Techniques · Hate Speech and Cyberbullying Detection
