Testing Pre-trained Language Models' Understanding of Distributivity via Causal Mediation Analysis

Pangbo Ban; Yifan Jiang; Tianran Liu; Shane Steinert-Threlkeld

arXiv:2209.04761·cs.CL·October 20, 2022

TL;DR

This paper investigates how well pre-trained language models understand distributivity. It introduces a new diagnostic dataset, DistNLI, and uses causal mediation analysis to examine how models handle this semantic phenomenon.

Contribution

The paper introduces DistNLI, a diagnostic natural language inference dataset that targets distributivity, and applies causal mediation analysis to study how models understand and encode this semantic phenomenon.

Findings

01 The extent of models' understanding of distributivity is associated with model size and vocabulary size

02 Insights into how models encode high-level semantic knowledge

03 DistNLI enables targeted evaluation of distributivity comprehension

Abstract

To what extent do pre-trained language models grasp semantic knowledge regarding the phenomenon of distributivity? In this paper, we introduce DistNLI, a new diagnostic dataset for natural language inference that targets the semantic difference arising from distributivity, and employ the causal mediation analysis framework to quantify the model behavior and explore the underlying mechanism in this semantically-related task. We find that the extent of models' understanding is associated with model size and vocabulary size. We also provide insights into how models encode such high-level semantic knowledge.
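The abstract names causal mediation analysis without spelling out what it measures. As a rough illustration only, the sketch below computes a total effect and an indirect effect on a tiny hand-written network. Everything here is an assumption made for illustration: the toy weights, the function names (`run`, `total_effect`, `indirect_effect`), and the ratio-based effect definition are hypothetical and are not taken from the paper or its repository.

```python
import math

# Hypothetical toy "model": two inputs, two hidden units, one output score.
# The weights are arbitrary illustration values, not from any real model.
W_IN = [[1.0, -0.5], [0.25, 0.75]]  # one row of input weights per hidden unit
W_OUT = [0.8, -0.3]                 # one output weight per hidden unit


def hidden(x):
    """Hidden activations for input x (these play the role of mediators)."""
    return [math.tanh(sum(w * xi for w, xi in zip(row, x))) for row in W_IN]


def output(h):
    """Map hidden activations to a probability-like score in (0, 1)."""
    z = sum(w * hi for w, hi in zip(W_OUT, h))
    return 1.0 / (1.0 + math.exp(-z))


def run(x, patch=None):
    """Run the model on x; `patch` maps hidden-unit index -> forced value."""
    h = hidden(x)
    for i, v in (patch or {}).items():
        h[i] = v
    return output(h)


def total_effect(x_base, x_alt):
    """Relative change in the output when the whole input is changed."""
    return run(x_alt) / run(x_base) - 1.0


def indirect_effect(x_base, x_alt, unit):
    """Relative change when only one hidden unit takes its value under x_alt,
    while the input (and every other unit) stays at the base condition."""
    h_alt = hidden(x_alt)
    return run(x_base, {unit: h_alt[unit]}) / run(x_base) - 1.0


if __name__ == "__main__":
    base, alt = [1.0, 0.0], [0.0, 1.0]
    print("total effect:", total_effect(base, alt))
    for u in range(2):
        print(f"indirect effect via unit {u}:", indirect_effect(base, alt, u))
```

In this toy setup, patching every hidden unit at once reproduces the output on the alternative input, so the mediators jointly carry the whole effect; a real analysis would instead patch components of a trained language model.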

Code & Models

Repositories

1fanj/CMA-distributivity
PyTorch · Official

Taxonomy

Topics: Topic Modeling · Machine Learning in Healthcare · Explainable Artificial Intelligence (XAI)