Causal Analysis of Syntactic Agreement Mechanisms in Neural Language Models

Matthew Finlayson; Aaron Mueller; Sebastian Gehrmann; Stuart Shieber; Tal Linzen; Yonatan Belinkov

arXiv:2106.06087 · cs.CL · June 23, 2021



TL;DR

This paper uses causal mediation analysis to explore how neural language models perform subject-verb agreement, revealing different mechanisms and neuron usage depending on sentence structure and model size.

Contribution

It applies causal mediation analysis to pre-trained neural language models to understand the mechanisms behind subject-verb agreement, highlighting similarities and differences across architectures, model sizes, and sentence structures.

Findings

01

Larger models do not necessarily have stronger grammatical preferences.

02

Models use two distinct mechanisms for subject-verb agreement, depending on the syntactic structure of the input sentence.

03

Models rely on similar neurons for sentences with similar syntax.

Abstract

Targeted syntactic evaluations have demonstrated the ability of language models to perform subject-verb agreement given difficult contexts. To elucidate the mechanisms by which the models accomplish this behavior, this study applies causal mediation analysis to pre-trained neural language models. We investigate the magnitude of models' preferences for grammatical inflections, as well as whether neurons process subject-verb agreement similarly across sentences with different syntactic structures. We uncover similarities and differences across architectures and model sizes -- notably, that larger models do not necessarily learn stronger preferences. We also observe two distinct mechanisms for producing subject-verb agreement depending on the syntactic structure of the input sentence. Finally, we find that language models rely on similar sets of neurons when given sentences with similar…
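To make the idea of causal mediation analysis concrete, the sketch below computes a total effect and a per-neuron indirect effect on a toy network. It is an illustration only: the tiny model, its weights, and the function names (`preference_ratio`, `total_effect`, `indirect_effect`) are hypothetical, and the relative-change formulation used here is one common way to define these effects, assumed rather than taken from the abstract.

```python
import math

# Hypothetical toy model (not the paper's): the input is the subject's number
# (+1 = singular, -1 = plural), there are two hidden neurons, and the output
# is a distribution over a singular and a plural verb form.
W_IN = [1.5, -0.5]           # input -> hidden weights (made up)
W_OUT = [[2.0, -1.0],        # hidden -> logit of the singular verb
         [-2.0, 1.0]]        # hidden -> logit of the plural verb


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def hidden(x):
    """Hidden activations for input x."""
    return [math.tanh(w * x) for w in W_IN]


def preference_ratio(h):
    """p(plural verb) / p(singular verb) given hidden activations h.

    With a singular subject as the baseline, this is the ratio of the
    ungrammatical form to the grammatical form.
    """
    logits = [sum(w * hi for w, hi in zip(row, h)) for row in W_OUT]
    p_sg, p_pl = softmax(logits)
    return p_pl / p_sg


def total_effect(x_base, x_alt):
    """Relative change in the ratio when the whole input is changed."""
    y_base = preference_ratio(hidden(x_base))
    y_alt = preference_ratio(hidden(x_alt))
    return (y_alt - y_base) / y_base


def indirect_effect(x_base, x_alt, neuron):
    """Relative change in the ratio when only one neuron is set to the
    value it would take under the alternative input."""
    h_base = hidden(x_base)
    h_patched = list(h_base)
    h_patched[neuron] = hidden(x_alt)[neuron]
    y_base = preference_ratio(h_base)
    return (preference_ratio(h_patched) - y_base) / y_base


if __name__ == "__main__":
    print("total effect:", total_effect(1.0, -1.0))
    for i in range(len(W_IN)):
        print(f"indirect effect via neuron {i}:", indirect_effect(1.0, -1.0, i))
```

Comparing the indirect effects across neurons shows which units mediate most of the change in the model's preference; in this toy setup neuron 0 carries more of the effect than neuron 1.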


Code & Models

Repositories

mattf1n/lm-intervention
PyTorch · Official
