DISCO: DISCovering Overfittings as Causal Rules for Text Classification Models

Zijian Zhang; Vinay Setty; Yumeng Wang; Avishek Anand

arXiv:2411.04649·cs.AI·November 8, 2024

TL;DR

DISCO is a novel method that uncovers causal, rule-based explanations for text classification models, revealing overfitting and spurious correlations to improve interpretability and model robustness.

Contribution

The paper introduces a scalable sequence mining approach that discovers global, causal n-gram rules to explain model predictions and detect overfitting, and it reports outperforming existing interpretability methods.

Findings

01. Achieved 100% detection of manually inserted shortcuts in training data.

02. Identified an 18.8% performance regression due to overfitting.

03. Enabled interactive explanations to distinguish spurious from genuine features.

Abstract

With the rapid advancement of neural language models, the deployment of over-parameterized models has surged, increasing the need for interpretable explanations comprehensible to human inspectors. Existing post-hoc interpretability methods, which often focus on unigram features of single input textual instances, fail to capture the models' decision-making process fully. Additionally, many methods do not differentiate between decisions based on spurious correlations and those based on a holistic understanding of the input. Our paper introduces DISCO, a novel method for discovering global, rule-based explanations by identifying causal n-gram associations with model predictions. This method employs a scalable sequence mining technique to extract relevant text spans from training data, associate them with model predictions, and conduct causality checks to distill robust rules that elucidate…
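The pipeline the abstract outlines (mine text spans, associate them with model predictions, then run a causality check) can be illustrated with a small sketch. This is not the authors' implementation: the excerpt does not specify the mining algorithm or the causality test, so the exhaustive n-gram enumeration, the support and confidence thresholds, the insertion-based check, and the names `toy_predict`, `mine_candidate_rules`, and `passes_causal_check` are all assumptions made for illustration.

```python
from collections import Counter, defaultdict
from typing import Callable, Dict, List, Set, Tuple

NGram = Tuple[str, ...]

def ngrams(tokens: List[str], max_n: int) -> Set[NGram]:
    """All distinct contiguous n-grams of length 1..max_n in one text."""
    return {tuple(tokens[i:i + n])
            for n in range(1, max_n + 1)
            for i in range(len(tokens) - n + 1)}

def mine_candidate_rules(texts: List[str], predict: Callable[[str], str],
                         max_n: int = 2, min_support: int = 2,
                         min_confidence: float = 0.9) -> Dict[NGram, str]:
    """Map each frequent n-gram to the label the model almost always predicts with it."""
    label_counts: Dict[NGram, Counter] = defaultdict(Counter)
    for text in texts:
        label = predict(text)
        for gram in ngrams(text.split(), max_n):
            label_counts[gram][label] += 1
    rules = {}
    for gram, counts in label_counts.items():
        label, hits = counts.most_common(1)[0]
        total = sum(counts.values())
        if total >= min_support and hits / total >= min_confidence:
            rules[gram] = label
    return rules

def passes_causal_check(gram: NGram, label: str, probes: List[str],
                        predict: Callable[[str], str],
                        min_flip_rate: float = 0.9) -> bool:
    """Hypothetical check: appending the n-gram to texts that currently get a
    different prediction must move most of them to `label`."""
    others = [p for p in probes if predict(p) != label]
    if not others:
        return False
    flipped = sum(predict(p + " " + " ".join(gram)) == label for p in others)
    return flipped / len(others) >= min_flip_rate

def toy_predict(text: str) -> str:
    """Stand-in classifier (assumption): it has overfit to the shortcut token 'zzz'."""
    tokens = text.split()
    return "pos" if "zzz" in tokens or "great" in tokens else "neg"

train = ["great acting zzz", "great acting", "zzz dull plot",
         "dull plot", "boring film", "zzz boring film"]
probes = ["dull plot", "boring film"]

rules = mine_candidate_rules(train, toy_predict)
causal = {gram for gram, label in rules.items()
          if passes_causal_check(gram, label, probes, toy_predict)}
```

In this toy setting, `("acting",)` is mined as a candidate because it only co-occurs with positive predictions, but it is dropped by the check because inserting it does not change the prediction. The shortcut token `("zzz",)` survives, which is the kind of overfitting signal such rules are meant to surface.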

Taxonomy

Topics: Computational and Text Analysis Methods

Methods: Focus