Optimal and efficient text counterfactuals using Graph Neural Networks

Dimitris Lymperopoulos; Maria Lymperaiou; Giorgos Filandrianos; Giorgos Stamou

arXiv:2408.01969 · cs.CL · August 4, 2025

TL;DR

This paper introduces a graph neural network-based framework for generating fast, contrastive, and minimal counterfactual explanations in NLP tasks, enhancing model interpretability.

Contribution

It presents a novel GNN-based method for efficient and semantically meaningful counterfactual generation in NLP, outperforming existing approaches in speed and quality.

Findings

01 Generated counterfactuals are contrastive, fluent, and minimal.

02 The framework is significantly faster than state-of-the-art methods.

03 The framework is effective on binary sentiment classification and topic classification tasks.

Abstract

As NLP models become increasingly integral to decision-making processes, the need for explainability and interpretability has become paramount. In this work, we propose a framework that addresses this need by generating semantically edited inputs, known as counterfactual interventions, which change the model prediction and thus provide a form of counterfactual explanation for the model. We test our framework on two NLP tasks, binary sentiment classification and topic classification, and show that the generated edits are contrastive, fluent, and minimal, while the whole process remains significantly faster than other state-of-the-art counterfactual editors.
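To make two of the properties named in the abstract concrete, here is a minimal sketch of how an edit could be scored as contrastive (the prediction flips) and minimal (few tokens change). It is not the paper's evaluation code: the function names, the whitespace tokenization, the normalized token-level Levenshtein distance, and the keyword-based `toy_sentiment` classifier are all assumptions made for illustration. Fluency is left out because it would require a language model.

```python
from typing import Callable, List


def token_edit_distance(a: List[str], b: List[str]) -> int:
    """Levenshtein distance between two token sequences."""
    prev = list(range(len(b) + 1))
    for i, tok_a in enumerate(a, start=1):
        curr = [i]
        for j, tok_b in enumerate(b, start=1):
            cost = 0 if tok_a == tok_b else 1
            curr.append(min(prev[j] + 1,          # delete tok_a
                            curr[j - 1] + 1,      # insert tok_b
                            prev[j - 1] + cost))  # substitute or keep
        prev = curr
    return prev[-1]


def evaluate_edit(original: str, edited: str,
                  predict: Callable[[str], str]) -> dict:
    """Score one counterfactual edit against a classifier."""
    orig_tokens = original.split()  # assumption: whitespace tokenization
    edit_tokens = edited.split()
    distance = token_edit_distance(orig_tokens, edit_tokens)
    return {
        # contrastive: the edit changes the predicted label
        "flipped": predict(original) != predict(edited),
        "token_edits": distance,
        # minimal: edits relative to the original length (lower is better)
        "minimality": distance / max(len(orig_tokens), 1),
    }


def toy_sentiment(text: str) -> str:
    """Hypothetical keyword classifier standing in for a real model."""
    negative = {"terrible", "boring", "bad"}
    is_negative = any(word in negative for word in text.lower().split())
    return "negative" if is_negative else "positive"


result = evaluate_edit("the movie was terrible",
                       "the movie was wonderful",
                       toy_sentiment)
print(result)
```

In this toy case a single substituted word flips the label, so the edit counts as contrastive with a minimality of 0.25. Averaging `flipped` over a dataset would give a flip rate for an editor.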

Code & Models

Repositories

Jimlibo/GNN-Counterfactual-Editor
PyTorch · Official

Taxonomy

Topics: Spam and Phishing Detection · Advanced Malware Detection Techniques · Hate Speech and Cyberbullying Detection