Combining Feature and Instance Attribution to Detect Artifacts

Pouya Pezeshkpour; Sarthak Jain; Sameer Singh; Byron C. Wallace

arXiv:2107.00323 · cs.CL · March 29, 2022

TL;DR

This paper introduces hybrid attribution methods that combine feature attribution (saliency maps) with instance attribution (retrieval of influential training examples) to help researchers identify artifacts in the training data of NLP models.

Contribution

It proposes novel hybrid attribution techniques that effectively uncover training data artifacts and evaluates their practical usefulness through a user study.

Findings

1. Hybrid attribution methods successfully identify artifacts in training data.

2. The proposed approaches outperform existing attribution techniques in artifact detection.

3. A user study indicates that these methods are helpful for NLP researchers.

Abstract

Training the deep neural networks that dominate NLP requires large datasets. These are often collected automatically or via crowdsourcing, and may exhibit systematic biases or annotation artifacts. By the latter we mean spurious correlations between inputs and outputs that do not represent a generally held causal relationship between features and classes; models that exploit such correlations may appear to perform a given task well, but fail on out-of-sample data. In this paper we evaluate the use of different attribution methods for aiding identification of training data artifacts. We propose new hybrid approaches that combine saliency maps (which highlight important input features) with instance attribution methods (which retrieve training samples influential to a given prediction). We show that this proposed training-feature attribution can be used to efficiently uncover artifacts in…
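
To make the idea of combining the two attribution types concrete, here is a minimal sketch. It is not the authors' implementation. It assumes a toy logistic regression model with fixed weights, and it swaps in a simple gradient dot-product similarity as the instance attribution score, since the abstract does not say which instance attribution methods are used. The function names (`instance_score`, `training_feature_attribution`) and the toy data are hypothetical. The sketch first scores a training example against a test prediction, then takes the gradient of that score with respect to the training example's own features, which gives a saliency map over the retrieved training example.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def dot(a, b):
    return sum(ai * bi for ai, bi in zip(a, b))


def loss_grad(w, x, y):
    """Gradient of the logistic loss w.r.t. the weights: (p - y) * x."""
    err = sigmoid(dot(w, x)) - y
    return [err * xi for xi in x]


def instance_score(w, x_test, y_test, x_train, y_train):
    """Instance attribution (assumed here): dot product of the loss
    gradients of the test example and the training example."""
    return dot(loss_grad(w, x_test, y_test), loss_grad(w, x_train, y_train))


def training_feature_attribution(w, x_test, y_test, x_train, y_train):
    """Gradient of instance_score w.r.t. the features of x_train.

    score = e_test * e_train * (x_test . x_train), with e = p - y, so
    d score / d x_train[j] =
        e_test * (e_train * x_test[j] + (x_test . x_train) * p(1 - p) * w[j])
    where p is the model's predicted probability for x_train.
    """
    p_train = sigmoid(dot(w, x_train))
    e_test = sigmoid(dot(w, x_test)) - y_test
    e_train = p_train - y_train
    overlap = dot(x_test, x_train)
    return [
        e_test * (e_train * xt + overlap * p_train * (1.0 - p_train) * wj)
        for xt, wj in zip(x_test, w)
    ]


if __name__ == "__main__":
    # Hypothetical toy data: three features, fixed model weights.
    w = [0.5, -1.0, 2.0]
    x_test, y_test = [1.0, 0.0, 1.0], 1
    train = [([0.0, 1.0, 1.0], 0), ([1.0, 1.0, 0.0], 1), ([1.0, 0.0, 1.0], 1)]

    # Step 1: rank training examples by their score for the test prediction.
    ranked = sorted(
        train,
        key=lambda ex: instance_score(w, x_test, y_test, ex[0], ex[1]),
        reverse=True,
    )
    # Step 2: attribute the top example's score to its individual features.
    top_x, top_y = ranked[0]
    print(training_feature_attribution(w, x_test, y_test, top_x, top_y))
```

In this toy setting the features are continuous and the gradient is available in closed form. For a neural text classifier one would more plausibly take the gradient with respect to token embeddings using automatic differentiation; that choice is an assumption here, not something the abstract states.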

Taxonomy

Topics: Topic Modeling · Domain Adaptation and Few-Shot Learning · Machine Learning and Data Classification