Goodhart's Law Applies to NLP's Explanation Benchmarks

Jennifer Hsia; Danish Pruthi; Aarti Singh; Zachary C. Lipton

arXiv:2308.14272 · cs.CL · August 29, 2023

TL;DR

This paper critically examines NLP explanation benchmarks and shows that their scores can be manipulated without changing model predictions, which calls into question their reliability for guiding explainability research.

Contribution

The paper demonstrates that the ERASER metrics (comprehensiveness and sufficiency) and the EVAL-X metrics can be inflated without changing model predictions, exposing their limitations and prompting a reassessment of evaluation standards.

Findings

01. Metrics can be inflated without changing model predictions

02. Current benchmarks are vulnerable to simple manipulations

03. Results question the reliability of explanation metrics

Abstract

Despite the rising popularity of saliency-based explanations, the research community remains at an impasse, facing doubts concerning their purpose, efficacy, and tendency to contradict each other. Seeking to unite the community's efforts around common goals, several recent works have proposed evaluation metrics. In this paper, we critically examine two sets of metrics: the ERASER metrics (comprehensiveness and sufficiency) and the EVAL-X metrics, focusing our inquiry on natural language processing. First, we show that we can inflate a model's comprehensiveness and sufficiency scores dramatically without altering its predictions or explanations on in-distribution test inputs. Our strategy exploits the tendency for extracted explanations and their complements to be "out-of-support" relative to each other and in-distribution inputs. Next, we demonstrate that the EVAL-X metrics can be…
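
The two ERASER metrics named in the abstract can be sketched in a few lines. This is a minimal illustration, not the paper's code: it assumes the commonly stated definitions (comprehensiveness is the drop in the predicted class's probability when the rationale tokens are removed; sufficiency is the drop when only the rationale tokens are kept), and `toy_model`, the token list, and the rationale indices are hypothetical stand-ins chosen for the example.

```python
from typing import Callable, List, Sequence, Set

# Hypothetical model type: maps a token list to class probabilities.
Model = Callable[[Sequence[str]], List[float]]


def toy_model(tokens: Sequence[str]) -> List[float]:
    """Toy stand-in classifier (an assumption): returns [p_negative, p_positive]."""
    pos = sum(t in {"good", "great"} for t in tokens)
    neg = sum(t in {"bad", "awful"} for t in tokens)
    p_pos = (1 + pos) / (2 + pos + neg)
    return [1 - p_pos, p_pos]


def predicted_class(probs: Sequence[float]) -> int:
    """Index of the most probable class."""
    return max(range(len(probs)), key=lambda k: probs[k])


def comprehensiveness(model: Model, tokens: Sequence[str], rationale: Set[int]) -> float:
    """Probability drop for the predicted class when the rationale is removed."""
    probs = model(tokens)
    j = predicted_class(probs)
    complement = [t for i, t in enumerate(tokens) if i not in rationale]
    return probs[j] - model(complement)[j]


def sufficiency(model: Model, tokens: Sequence[str], rationale: Set[int]) -> float:
    """Probability drop for the predicted class when only the rationale is kept."""
    probs = model(tokens)
    j = predicted_class(probs)
    kept = [t for i, t in enumerate(tokens) if i in rationale]
    return probs[j] - model(kept)[j]


tokens = ["the", "movie", "was", "great", "but", "long"]
rationale = {3}  # hypothetical explanation: the token "great"
comp = comprehensiveness(toy_model, tokens, rationale)
suff = sufficiency(toy_model, tokens, rationale)
print(f"comprehensiveness={comp:.3f} sufficiency={suff:.3f}")
```

Under these definitions a higher comprehensiveness and a lower sufficiency are read as better. Both scores depend on how the model behaves on reduced inputs (the rationale alone, or the input with the rationale removed), which are the inputs the abstract describes as "out-of-support" relative to ordinary test inputs.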

Taxonomy

Topics: Explainable Artificial Intelligence (XAI) · Topic Modeling · Scientific Computing and Data Management