Causal Analysis of Syntactic Agreement Neurons in Multilingual Language Models
Aaron Mueller, Yu Xia, Tal Linzen

TL;DR
This paper causally investigates how multilingual and monolingual language models encode syntactic agreement, revealing neuron overlaps across languages, layer-specific patterns, and differences between model types.
Contribution
It applies causal probing, using counterfactual perturbations of neuron activations, to analyze how multilingual models encode syntactic agreement, addressing the confounds of prior correlational studies.
Findings
Significant neuron overlap across languages in autoregressive multilingual models, but not in masked language models
Two distinct layer-wise effect patterns for syntactic agreement
Masked language models are more sensitive to syntactic information than behavioral analyses suggest
Abstract
Structural probing work has found evidence for latent syntactic information in pre-trained language models. However, much of this analysis has focused on monolingual models, and analyses of multilingual models have employed correlational methods that are confounded by the choice of probing tasks. In this study, we causally probe multilingual language models (XGLM and multilingual BERT) as well as monolingual BERT-based models across various languages; we do this by performing counterfactual perturbations on neuron activations and observing the effect on models' subject-verb agreement probabilities. We observe where in the model and to what extent syntactic agreement is encoded in each language. We find significant neuron overlap across languages in autoregressive multilingual language models, but not masked language models. We also find two distinct layer-wise effect patterns and two…
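The counterfactual perturbation described in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: the two-layer network, its weights, and the names `hidden`, `verb_probs`, `error_ratio`, and `neuron_effect` are all hypothetical, and the relative-change measure is one plausible choice rather than the paper's confirmed metric. The idea is to run the model on an input, overwrite a single neuron's activation with the value it takes when the subject's number is swapped, and measure how far the ratio of incorrect to correct verb probability moves.

```python
import math

# Hypothetical toy weights: 2 input features -> 3 hidden neurons -> 2 verb logits.
# Input features stand in for "subject is singular" / "subject is plural".
W_IN = [[2.0, -2.0], [-1.5, 1.5], [0.3, 0.3]]   # one row per hidden neuron
W_OUT = [[1.0, -1.0, 0.2], [-1.0, 1.0, 0.2]]    # rows: singular verb, plural verb


def hidden(x):
    """Hidden-layer activations for input x."""
    return [math.tanh(sum(w * xi for w, xi in zip(row, x))) for row in W_IN]


def verb_probs(h):
    """Softmax over the [singular, plural] verb logits given activations h."""
    logits = [sum(w * hi for w, hi in zip(row, h)) for row in W_OUT]
    top = max(logits)
    exps = [math.exp(l - top) for l in logits]
    total = sum(exps)
    return [e / total for e in exps]


def error_ratio(h, correct):
    """p(incorrect verb) / p(correct verb); `correct` is 0 (singular) or 1 (plural)."""
    p = verb_probs(h)
    return p[1 - correct] / p[correct]


def neuron_effect(x, x_cf, correct, neuron):
    """Relative change in the error ratio when one neuron's activation is
    replaced by its value under the counterfactual input x_cf."""
    h = hidden(x)
    h_cf = hidden(x_cf)
    base = error_ratio(h, correct)
    patched = list(h)
    patched[neuron] = h_cf[neuron]   # the intervention: swap a single activation
    return (error_ratio(patched, correct) - base) / base


singular = [1.0, 0.0]
plural = [0.0, 1.0]
# Effect of each hidden neuron for a singular subject (correct verb index 0).
effects = [neuron_effect(singular, plural, 0, n) for n in range(3)]
```

In this toy setup, neurons 0 and 1 respond to subject number, so patching them raises the error ratio, while neuron 2 has the same activation under both inputs and therefore has no effect. Ranking neurons by such effects, per layer and per language, is the kind of comparison the abstract alludes to.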
Taxonomy
Topics: Topic Modeling · Text Readability and Simplification · Natural Language Processing Techniques
