Multilingual Multi-Aspect Explainability Analyses on Machine Reading Comprehension Models

Yiming Cui; Wei-Nan Zhang; Wanxiang Che; Ting Liu; Zhigang Chen; Shijin Wang

arXiv:2108.11574·cs.CL·October 29, 2024

TL;DR

This paper investigates the internal attention mechanisms of multilingual pre-trained language models in machine reading comprehension, revealing key attention patterns linked to model performance and enhancing explainability.

Contribution

It provides a multilingual analysis of attention mechanisms in PLMs for MRC, highlighting the importance of passage-to-question and passage understanding attentions.

Findings

01 Passage-to-question and passage understanding attentions are the most important ones in the question answering process.

02 These two attentions correlate more strongly with final MRC performance than the other attention parts.

03 Visualizations and case studies reveal common attention patterns.

Abstract

Achieving human-level performance on some Machine Reading Comprehension (MRC) datasets is no longer challenging with the help of powerful Pre-trained Language Models (PLMs). However, the internal mechanism of these models remains unclear, which is an obstacle to understanding them further. This paper conducts a series of analytical experiments to examine the relation between multi-head self-attention and final MRC system performance, revealing the potential explainability of PLM-based MRC models. To ensure the robustness of the analyses, we perform our experiments in a multilingual way on top of various PLMs. We discover that passage-to-question and passage understanding attentions are the most important ones in the question answering process, showing stronger correlations with the final performance than other parts. Through comprehensive…
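
The attention parts named in the abstract can be pictured as blocks of a self-attention matrix over a concatenated question and passage. The following is a minimal sketch, not the authors' implementation. It assumes the input is laid out as question tokens followed by passage tokens (special tokens ignored), that rows are attending tokens and columns are attended tokens, and that a part is disabled by zeroing its weights without renormalizing. The function names `zone_masks` and `mask_zone`, and the two zone names not mentioned in the abstract, are hypothetical.

```python
def zone_masks(num_question_tokens, num_passage_tokens):
    """Build boolean masks for four blocks of a (query x key) attention matrix.

    Assumed layout: question tokens first, then passage tokens.
    """
    n = num_question_tokens + num_passage_tokens
    # True for positions that belong to the passage, False for the question.
    is_passage = [i >= num_question_tokens for i in range(n)]

    def block(rows_are_passage, cols_are_passage):
        # A cell belongs to the block when its row (attending token) and
        # column (attended token) fall in the requested segments.
        return [
            [
                is_passage[r] == rows_are_passage
                and is_passage[c] == cols_are_passage
                for c in range(n)
            ]
            for r in range(n)
        ]

    return {
        "question_understanding": block(False, False),  # hypothetical name
        "question_to_passage": block(False, True),      # hypothetical name
        "passage_to_question": block(True, False),
        "passage_understanding": block(True, True),
    }


def mask_zone(attention, zone_mask):
    """Return a copy of `attention` with the selected block set to zero."""
    return [
        [0.0 if hit else weight for weight, hit in zip(row, mask_row)]
        for row, mask_row in zip(attention, zone_mask)
    ]


if __name__ == "__main__":
    masks = zone_masks(num_question_tokens=2, num_passage_tokens=3)
    uniform = [[0.2] * 5 for _ in range(5)]
    ablated = mask_zone(uniform, masks["passage_to_question"])
    for row in ablated:
        print(row)
```

Comparing task performance with and without a given block is one way to probe how much that part of the attention matters, which is the kind of relation the abstract describes.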


Code & Models

Repositories

ymcui/mrc-model-analysis (tf, Official)


Taxonomy

Methods

Multi-Head Attention · Attention Is All You Need · Linear Layer · Dropout · Weight Decay · Adam · Residual Connection · LAMB · WordPiece