Navigating the Shortcut Maze: A Comprehensive Analysis of Shortcut Learning in Text Classification by Language Models

Yuqing Zhou; Ruixiang Tang; Ziyu Yao; Ziwei Zhu

arXiv:2409.17455·cs.CL·November 13, 2024

TL;DR

This paper introduces a comprehensive benchmark to analyze how language models rely on complex shortcuts in text classification, revealing their vulnerabilities and resilience to subtle spurious correlations.

Contribution

It presents a new benchmark categorizing shortcuts into occurrence, style, and concept, enabling systematic evaluation of models' susceptibility to nuanced shortcuts.

Findings

01 Models vary in resilience to different shortcut types.

02 State-of-the-art models show some robustness but remain vulnerable.

03 Benchmark and code are publicly available for further research.

Abstract

Language models (LMs), despite their advances, often depend on spurious correlations, undermining their accuracy and generalizability. This study addresses the overlooked impact of subtler, more complex shortcuts that compromise model reliability beyond oversimplified shortcuts. We introduce a comprehensive benchmark that categorizes shortcuts into occurrence, style, and concept, aiming to explore the nuanced ways in which these shortcuts influence the performance of LMs. Through extensive experiments across traditional LMs, large language models, and state-of-the-art robust models, our research systematically investigates models' resilience and susceptibilities to sophisticated shortcuts. Our benchmark and code can be found at: https://github.com/yuqing-zhou/shortcut-learning-in-text-classification.
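
As a rough illustration of what an occurrence-style shortcut can look like in a text classification dataset, the sketch below appends a trigger word to texts so that its presence correlates with one label. This is not the benchmark's construction procedure; the function names, the trigger word, and the probabilities are hypothetical choices made only for this example.

```python
import random


def inject_occurrence_shortcut(examples, trigger, target_label,
                               p_match=0.9, p_other=0.1, seed=0):
    """Append `trigger` to texts so its presence correlates with `target_label`.

    `examples` is a list of (text, label) pairs. All names and default
    probabilities here are illustrative assumptions, not values from the paper.
    """
    rng = random.Random(seed)
    shortcut_examples = []
    for text, label in examples:
        # Texts with the target label receive the trigger far more often.
        p = p_match if label == target_label else p_other
        if rng.random() < p:
            text = f"{text} {trigger}"
        shortcut_examples.append((text, label))
    return shortcut_examples


def trigger_label_rate(examples, trigger, target_label):
    """Fraction of trigger-bearing texts that carry `target_label`."""
    labels = [label for text, label in examples if text.endswith(trigger)]
    if not labels:
        return 0.0
    return sum(label == target_label for label in labels) / len(labels)


if __name__ == "__main__":
    data = [("a touching, well-acted film", 1), ("dull and overlong", 0)] * 500
    train = inject_occurrence_shortcut(data, "honestly", target_label=1)
    # Evaluation split with the correlation reversed, to expose shortcut reliance.
    test = inject_occurrence_shortcut(data, "honestly", target_label=0, seed=1)
    print("train rate:", trigger_label_rate(train, "honestly", 1))
    print("test rate:", trigger_label_rate(test, "honestly", 1))
```

A classifier that scores well on the `train`-style split but degrades on the reversed `test`-style split is likely leaning on the trigger word rather than on the sentiment of the text. Style and concept shortcuts would need a different construction and are not covered by this sketch.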

Code & Models

Repositories

yuqing-zhou/shortcut-learning-in-text-classification (Official)

Videos

Navigating the Shortcut Maze: A Comprehensive Analysis of Shortcut Learning in Text Classification by Language Models · underline

Taxonomy

Topics: Text and Document Classification Technologies · Natural Language Processing Techniques