An Empirical Comparison of Instance Attribution Methods for NLP

Pouya Pezeshkpour; Sarthak Jain; Byron C. Wallace; Sameer Singh

arXiv:2104.04128 · cs.CL · April 12, 2021 · 1 citation

TL;DR

This paper empirically compares various instance attribution methods for NLP, evaluating their agreement and effectiveness in identifying training data influencing model predictions.

Contribution

It provides a systematic evaluation of simple retrieval versus gradient-based attribution methods, highlighting their similarities and differences in NLP tasks.

Findings

01. Simple retrieval methods often identify different training instances than influence functions.

02. Despite differences, simple methods show desirable characteristics similar to complex attribution techniques.

03. Gradient-based methods may offer more precise attribution but are computationally expensive.

Abstract

Widespread adoption of deep models has motivated a pressing need for approaches to interpret network outputs and to facilitate model debugging. Instance attribution methods constitute one means of accomplishing these goals by retrieving training instances that (may have) led to a particular prediction. Influence functions (IF; Koh and Liang 2017) provide machinery for doing this by quantifying the effect that perturbing individual train instances would have on a specific test prediction. However, even approximating the IF is computationally expensive, to a degree that may be prohibitive in many cases. Might simpler approaches (e.g., retrieving train examples most similar to a given test point) perform comparably? In this work, we evaluate the degree to which different potential instance attribution methods agree with respect to the importance of training samples. We find that simple retrieval…
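
As a rough illustration of the comparison the abstract describes, the sketch below contrasts similarity-based retrieval with influence-function scores on a toy logistic regression, where the Hessian is small enough to solve against exactly. It is not the authors' code: the function names, the synthetic data, the L2 strength, and the top-k overlap measure are all assumptions made for this example. The influence score follows the Koh and Liang (2017) form, -grad L(z_test)^T H^{-1} grad L(z).

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def fit_logreg(x, y, steps=500, lr=0.5, l2=1e-2):
    """Fit L2-regularized logistic regression by plain gradient descent."""
    w = np.zeros(x.shape[1])
    for _ in range(steps):
        grad = x.T @ (sigmoid(x @ w) - y) / len(y) + l2 * w
        w -= lr * grad
    return w


def cosine_top_k(train_feats, test_feat, k):
    """Indices of the k training rows most cosine-similar to test_feat.

    Assumes no row (and not test_feat) is the zero vector.
    """
    train_unit = train_feats / np.linalg.norm(train_feats, axis=1, keepdims=True)
    test_unit = test_feat / np.linalg.norm(test_feat)
    return np.argsort(-(train_unit @ test_unit), kind="stable")[:k]


def influence_scores(train_x, train_y, test_x, test_y, w, l2=1e-2):
    """Influence of upweighting each training point on the test loss.

    Negative scores mean upweighting the point would lower the test loss.
    """
    p_train = sigmoid(train_x @ w)
    # Per-example gradients of the regularized training loss, shape (n, d).
    train_grads = (p_train - train_y)[:, None] * train_x + l2 * w
    # Gradient of the (unregularized) loss at the single test point, shape (d,).
    test_grad = (sigmoid(test_x @ w) - test_y) * test_x
    # Exact Hessian of the mean regularized training loss, shape (d, d).
    weights = p_train * (1.0 - p_train)
    hessian = (train_x * weights[:, None]).T @ train_x / len(train_y)
    hessian += l2 * np.eye(train_x.shape[1])
    return -train_grads @ np.linalg.solve(hessian, test_grad)


def top_k_overlap(a, b):
    """Fraction of indices shared by two equal-length top-k lists."""
    return len(set(a) & set(b)) / len(a)


# Synthetic data (an assumption for this sketch, not data from the paper).
rng = np.random.default_rng(0)
train_x = rng.normal(size=(60, 5))
train_y = (train_x[:, 0] + 0.5 * rng.normal(size=60) > 0).astype(float)
test_x, test_y = rng.normal(size=5), 1.0

w = fit_logreg(train_x, train_y)
scores = influence_scores(train_x, train_y, test_x, test_y, w)

k = 5
similar = cosine_top_k(train_x, test_x, k)
helpful = np.argsort(scores, kind="stable")[:k]  # most negative influence first
overlap = top_k_overlap(similar.tolist(), helpful.tolist())
print(f"top-{k} overlap between similarity and influence: {overlap:.2f}")
```

For deep models the Hessian cannot be formed or solved against exactly like this, which is the source of the computational cost the abstract mentions; the exact solve here only works because the toy model has five parameters.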

Code & Models

Repositories

successar/instance_attributions_NLP (PyTorch, official)

Taxonomy

Topics: Topic Modeling · Explainable Artificial Intelligence (XAI) · Machine Learning and Data Classification