LLMs for Generating and Evaluating Counterfactuals: A Comprehensive Study

Van Bach Nguyen; Paul Youssef; Christin Seifert; Jörg Schlötterer

arXiv:2405.00722·cs.CL·November 13, 2024

TL;DR

This study evaluates how well Large Language Models generate and assess counterfactual explanations in NLP, revealing their strengths in fluency but limitations in minimality and label-flipping, with implications for data augmentation and model interpretability.

Contribution

It provides a comprehensive comparison of LLMs in generating and evaluating counterfactuals for NLP tasks, highlighting their capabilities and limitations.

Findings

01

LLMs generate fluent counterfactuals but struggle with minimal changes.

02

Generating counterfactuals for Sentiment Analysis is easier than for NLI.

03

LLMs show bias towards original labels, affecting evaluation and augmentation.

Abstract

As NLP models become more complex, understanding their decisions becomes more crucial. Counterfactuals (CFs), where minimal changes to inputs flip a model's prediction, offer a way to explain these models. While Large Language Models (LLMs) have shown remarkable performance in NLP tasks, their efficacy in generating high-quality CFs remains uncertain. This work fills this gap by investigating how well LLMs generate CFs for two NLU tasks. We conduct a comprehensive comparison of several common LLMs, and evaluate their CFs, assessing both intrinsic metrics, and the impact of these CFs on data augmentation. Moreover, we analyze differences between human and LLM-generated CFs, providing insights for future research directions. Our results show that LLMs generate fluent CFs, but struggle to keep the induced changes minimal. Generating CFs for Sentiment Analysis (SA) is less challenging than…
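The abstract names two properties of a good counterfactual: the change to the input should be minimal, and the prediction should flip. The excerpt does not say how the study measures either one, so the following is only a minimal sketch under assumptions: minimality is approximated here by a whitespace-token Levenshtein distance normalized by the original length, and label flipping by the fraction of counterfactuals whose predicted label matches the intended target. The function names `token_edit_distance`, `minimality`, and `flip_rate` are hypothetical and not taken from the paper or its repository.

```python
from typing import Sequence


def token_edit_distance(a: Sequence[str], b: Sequence[str]) -> int:
    """Levenshtein distance between two token sequences."""
    prev = list(range(len(b) + 1))
    for i, tok_a in enumerate(a, start=1):
        curr = [i]
        for j, tok_b in enumerate(b, start=1):
            cost = 0 if tok_a == tok_b else 1
            curr.append(min(prev[j] + 1,          # delete tok_a
                            curr[j - 1] + 1,      # insert tok_b
                            prev[j - 1] + cost))  # keep or substitute
        prev = curr
    return prev[-1]


def minimality(original: str, counterfactual: str) -> float:
    """Assumed proxy: token edit distance divided by original length (lower is smaller change)."""
    orig_tokens = original.split()
    cf_tokens = counterfactual.split()
    return token_edit_distance(orig_tokens, cf_tokens) / max(len(orig_tokens), 1)


def flip_rate(predicted: Sequence[str], target: Sequence[str]) -> float:
    """Assumed proxy: share of counterfactuals whose predicted label equals the target label."""
    if not predicted:
        return 0.0
    hits = sum(p == t for p, t in zip(predicted, target))
    return hits / len(predicted)


if __name__ == "__main__":
    print(minimality("the movie was great", "the movie was terrible"))
    print(flip_rate(["negative", "positive"], ["negative", "negative"]))
```

Whitespace tokenization keeps the sketch dependency-free; a real evaluation would likely use the tokenizer of the model under study and labels predicted by an actual classifier.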

Code & Models

Repositories

aix-group/llms-for-cfs
PyTorch · Official

Videos

LLMs for Generating and Evaluating Counterfactuals: A Comprehensive Study · underline

Taxonomy

Topics: Statistical and Computational Modeling

Methods: Counterfactual Explanations · FLIP