Zero-shot LLM-guided Counterfactual Generation: A Case Study on NLP Model Evaluation

Amrita Bhattacharjee; Raha Moraffah; Joshua Garland; Huan Liu

arXiv:2405.04793 · cs.CL · November 20, 2024 · 1 citation

TL;DR

This paper investigates using large language models in a zero-shot setting to generate counterfactual examples for stress-testing and explaining NLP models, avoiding the need for task-specific fine-tuning.

Contribution

It introduces a structured pipeline leveraging LLMs for zero-shot counterfactual generation, demonstrating its effectiveness across multiple NLP tasks without additional training.
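
As a rough illustration of what a zero-shot counterfactual stress test could look like, the sketch below prompts a language model to minimally edit a text toward a target label and then measures how often a classifier's prediction follows the edit. Everything in it is an assumption made for illustration: the prompt wording, the function names `build_counterfactual_prompt` and `stress_test`, and the toy `toy_llm` and `toy_classifier` stand-ins are hypothetical and are not taken from the paper or its repository.

```python
from typing import Callable, List, Tuple


def build_counterfactual_prompt(text: str, original_label: str, target_label: str) -> str:
    """Build a zero-shot prompt (hypothetical wording, not the paper's)."""
    return (
        f"The following text has the label '{original_label}'.\n"
        f"Text: {text}\n"
        f"Make minimal edits so that the label becomes '{target_label}'. "
        "Return only the edited text."
    )


def stress_test(
    texts: List[str],
    original_label: str,
    target_label: str,
    llm: Callable[[str], str],
    classifier: Callable[[str], str],
) -> Tuple[List[str], float]:
    """Generate one counterfactual per text and return them together with
    the fraction the classifier assigns to the target label."""
    counterfactuals = [
        llm(build_counterfactual_prompt(t, original_label, target_label))
        for t in texts
    ]
    flipped = sum(classifier(cf) == target_label for cf in counterfactuals)
    rate = flipped / len(texts) if texts else 0.0
    return counterfactuals, rate


# Toy stand-ins so the sketch runs without any external model or API.
def toy_llm(prompt: str) -> str:
    # Recover the input text from the prompt and swap one sentiment word.
    text = next(
        line[len("Text: "):]
        for line in prompt.splitlines()
        if line.startswith("Text: ")
    )
    return text.replace("great", "terrible")


def toy_classifier(text: str) -> str:
    # Keyword rule standing in for a black-box sentiment model.
    return "negative" if "terrible" in text else "positive"


if __name__ == "__main__":
    samples = ["The movie was great", "A great cast", "I loved it"]
    cfs, flip_rate = stress_test(
        samples, "positive", "negative", toy_llm, toy_classifier
    )
    print(cfs)
    print(flip_rate)
```

In practice `llm` would wrap a call to an actual instruction-tuned model and `classifier` would wrap the model under test; both are passed in as plain callables so the evaluation loop stays independent of any particular provider.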

Findings

01. LLMs can generate high-quality counterfactuals in zero-shot settings.
02. The approach effectively stress-tests and helps explain black-box NLP models.
03. Zero-shot counterfactuals outperform some fine-tuned methods on certain tasks.

Abstract

With the development and proliferation of large, complex, black-box models for solving many natural language processing (NLP) tasks, there is also an increasing need for methods to stress-test these models and provide some degree of interpretability or explainability. While counterfactual examples are useful in this regard, automated generation of counterfactuals is a data- and resource-intensive process. Such methods depend on models such as pre-trained language models that are then fine-tuned on auxiliary, often task-specific datasets, which may be infeasible to build in practice, especially for new tasks and data domains. Therefore, in this work we explore the possibility of leveraging large language models (LLMs) for zero-shot counterfactual generation in order to stress-test NLP models. We propose a structured pipeline to facilitate this generation, and we hypothesize that the…

Code & Models

Repositories

AmritaBh/zero-shot-llm-counterfactual (official)

Taxonomy

Topics: Advanced Malware Detection Techniques · Digital Media Forensic Detection · Digital and Cyber Forensics

Methods: Focus · Counterfactuals Explanations