An Interpretability Evaluation Benchmark for Pre-trained Language Models

Yaozong Shen; Lijie Wang; Ying Chen; Xinyan Xiao; Jing Liu; Hua Wu

arXiv:2207.13948 · cs.CL · July 29, 2022 · 1 citation

TL;DR

This paper introduces a comprehensive benchmark for evaluating pre-trained language models across multiple interpretability dimensions, including grammar, semantics, knowledge, reasoning, and computation, with annotated rationales and perturbation-based faithfulness metrics.

Contribution

It provides the first multi-dimensional interpretability evaluation benchmark with token-level rationales and perturbation-based faithfulness metrics for pre-trained language models.

Findings

01

Pre-trained LMs perform poorly on knowledge and computation dimensions.

02

Models show low plausibility in interpretability across all dimensions.

03

Models lack robustness on syntax-aware data.

Abstract

While pre-trained language models (LMs) have brought great improvements to many NLP tasks, there is increasing interest in exploring the capabilities of LMs and interpreting their predictions. However, existing work usually focuses on a single capability, evaluated through a few downstream tasks. There is a lack of datasets for directly evaluating the masked word prediction performance and the interpretability of pre-trained LMs. To fill this gap, we propose a novel evaluation benchmark that provides annotated data in both English and Chinese. It tests LMs' abilities along multiple dimensions, i.e., grammar, semantics, knowledge, reasoning and computation. In addition, it provides carefully annotated token-level rationales that satisfy sufficiency and compactness. It contains perturbed instances for each original instance, so as to use the rationale consistency under perturbations as the metric for…
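The abstract is cut off before it defines its metrics, so the following is only an illustrative sketch of the two ideas it names: comparing a model's extracted rationale with the annotated token-level rationale, and checking whether the rationale stays stable when the instance is perturbed. The function names `token_f1` and `rationale_consistency` are hypothetical, and the token-set F1 and Jaccard overlap used here are stand-ins chosen for simplicity, not necessarily the formulas the paper uses.

```python
from typing import Iterable


def token_f1(predicted: Iterable[str], gold: Iterable[str]) -> float:
    """Token-set F1 between a predicted rationale and an annotated one.

    Assumption: rationales are compared as unordered sets of tokens.
    """
    pred_set, gold_set = set(predicted), set(gold)
    overlap = len(pred_set & gold_set)
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_set)
    recall = overlap / len(gold_set)
    return 2 * precision * recall / (precision + recall)


def rationale_consistency(original: Iterable[str], perturbed: Iterable[str]) -> float:
    """Jaccard overlap between rationales extracted before and after a perturbation.

    Assumption: Jaccard is a simple stand-in for the paper's consistency metric.
    """
    a, b = set(original), set(perturbed)
    if not a and not b:
        return 1.0  # two empty rationales are trivially identical
    return len(a & b) / len(a | b)


# Toy example with made-up tokens (not taken from the benchmark).
gold_rationale = ["capital", "France"]
model_rationale = ["capital", "France", "is"]
model_rationale_perturbed = ["capital", "France", "the"]

plausibility = token_f1(model_rationale, gold_rationale)
consistency = rationale_consistency(model_rationale, model_rationale_perturbed)
print(f"plausibility={plausibility:.2f} consistency={consistency:.2f}")
```

In this toy case the model's rationale covers both annotated tokens but adds one extra, and two of the four distinct tokens survive the perturbation. Set-based overlap ignores token order and importance scores, so a rank-aware measure may fit better when the model outputs a full attribution ranking.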

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Explainable Artificial Intelligence (XAI)