Explore Spurious Correlations at the Concept Level in Language Models for Text Classification

Yuhang Zhou; Paiheng Xu; Xiaoyu Liu; Bang An; Wei Ai; Furong Huang

arXiv:2311.08648 · cs.CL · June 18, 2024 · 1 citation

TL;DR

This paper investigates how language models rely on spurious concept-level correlations in text classification, using ChatGPT to identify concepts and generate counterfactual data to improve robustness against such biases.

Contribution

It introduces a novel approach employing ChatGPT for concept labeling and counterfactual data generation to detect and mitigate concept-level spurious correlations in language models.

Findings

1. ChatGPT effectively assigns concept labels to texts.
2. Counterfactual data reduces spurious correlations.
3. The method outperforms token-removal approaches.

Abstract

Language models (LMs) have achieved notable success in numerous NLP tasks, employing both fine-tuning and in-context learning (ICL) methods. While language models demonstrate exceptional performance, they face robustness challenges due to spurious correlations arising from imbalanced label distributions in training data or ICL exemplars. Previous research has primarily concentrated on word, phrase, and syntax features, neglecting the concept level, often due to the absence of concept labels and difficulty in identifying conceptual content in input texts. This paper introduces two main contributions. First, we employ ChatGPT to assign concept labels to texts, assessing concept bias in models during fine-tuning or ICL on test data. We find that LMs, when encountering spurious correlations between a concept and a label in training or prompts, resort to shortcuts for predictions. Second, we…
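To make the idea of a concept-label imbalance concrete, here is a minimal sketch that measures how far the label distribution within each concept departs from the overall label distribution. It is an illustration only, not the authors' metric or code: the function name `concept_label_skew`, the toy concepts ("food", "service"), and the toy labels are all hypothetical, and the concept labels are assumed to have been assigned already (the paper uses ChatGPT for that step).

```python
from collections import Counter, defaultdict


def concept_label_skew(examples):
    """Return {concept: {label: P(label | concept) - P(label)}}.

    `examples` is an iterable of (concept, label) pairs. A value far from
    zero means the concept co-occurs with that label more (positive) or
    less (negative) often than the label's base rate, which is the kind
    of imbalance a model could exploit as a shortcut.
    """
    examples = list(examples)
    total = len(examples)
    label_counts = Counter(label for _, label in examples)

    # Count labels separately for each concept.
    by_concept = defaultdict(Counter)
    for concept, label in examples:
        by_concept[concept][label] += 1

    skew = {}
    for concept, counts in by_concept.items():
        n = sum(counts.values())
        skew[concept] = {
            label: counts[label] / n - label_counts[label] / total
            for label in label_counts
        }
    return skew


# Hypothetical toy data: "food" mostly appears with positive reviews,
# "service" mostly with negative ones.
toy = [
    ("food", "pos"), ("food", "pos"), ("food", "pos"), ("food", "neg"),
    ("service", "neg"), ("service", "neg"), ("service", "neg"), ("service", "pos"),
]
skew = concept_label_skew(toy)
print(skew)
```

In this toy set the overall labels are balanced, yet each concept is skewed toward one label. A dataset with zero skew for every concept would offer no concept-level shortcut of this kind.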

Code & Models

Repositories

tonyzhou98/concept-spurious-correlation
PyTorch · Official

Videos

Explore Spurious Correlations at the Concept Level in Language Models for Text Classification · Underline

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Sentiment Analysis and Opinion Mining

Methods: Focus