Counterfactual Interventions Reveal the Causal Effect of Relative Clause Representations on Agreement Prediction
Shauli Ravfogel, Grusha Prasad, Tal Linzen, Yoav Goldberg

TL;DR
This paper introduces AlterRep, a counterfactual-intervention method for causally analyzing how BERT models process relative clauses (RCs). It finds that BERT variants use RC boundary information during word prediction in a manner consistent with English grammar, and that this information generalizes across RC types.
Contribution
The paper presents AlterRep, a novel intervention-based approach to causally investigate linguistic feature representations in language models, specifically applied to relative clauses in BERT.
Findings
BERT uses RC boundary information consistent with English grammar.
RC boundary information generalizes across different RC types.
AlterRep effectively reveals causal effects of linguistic features in models.
Abstract
When language models process syntactically complex sentences, do they use their representations of syntax in a manner that is consistent with the grammar of the language? We propose AlterRep, an intervention-based method to address this question. For any linguistic feature of a given sentence, AlterRep generates counterfactual representations by altering how the feature is encoded, while leaving intact all other aspects of the original representation. By measuring the change in a model's word prediction behavior when these counterfactual representations are substituted for the original ones, we can draw conclusions about the causal effect of the linguistic feature in question on the model's behavior. We apply this method to study how BERT models of different sizes process relative clauses (RCs). We find that BERT variants use RC boundary information during word prediction in a manner…
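The intervention described in the abstract can be illustrated with a toy sketch. It assumes, purely for illustration, that the feature is encoded along a single known linear direction `w`; the abstract does not say how the feature's encoding is located, so this is not the paper's exact procedure. The names `counterfactual`, `prob_first`, `alpha`, and the two readout vectors are hypothetical. The sketch alters only the component of a representation along `w`, leaves everything orthogonal to it intact, and then compares a toy two-word prediction before and after the substitution.

```python
import math


def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def counterfactual(h, w, sign, alpha=1.0):
    """Push the component of h along w to the chosen side (+1 or -1).

    Assumption: the feature lives on the single direction w.
    The part of h orthogonal to w is returned unchanged.
    """
    norm = math.sqrt(dot(w, w))
    unit = [x / norm for x in w]
    coeff = dot(h, unit)  # how strongly h encodes the feature
    rest = [hi - coeff * ui for hi, ui in zip(h, unit)]  # everything else
    new_coeff = sign * alpha * abs(coeff)  # forced sign, scaled magnitude
    return [ri + new_coeff * ui for ri, ui in zip(rest, unit)]


def prob_first(h, r_a, r_b):
    """Softmax probability of word A over word B under a toy linear readout."""
    la, lb = dot(h, r_a), dot(h, r_b)
    m = max(la, lb)  # subtract the max for numerical stability
    ea, eb = math.exp(la - m), math.exp(lb - m)
    return ea / (ea + eb)


# Toy representation and a hypothetical feature direction.
h = [0.5, -1.0, 2.0]
w = [0.0, 0.0, 1.0]

h_neg = counterfactual(h, w, sign=-1, alpha=2.0)
h_pos = counterfactual(h, w, sign=+1, alpha=2.0)

# Hypothetical readout vectors for two competing verb forms.
r_plural = [0.0, 0.0, 1.0]
r_singular = [0.0, 0.0, -1.0]

p_orig = prob_first(h, r_plural, r_singular)
p_neg = prob_first(h_neg, r_plural, r_singular)
p_pos = prob_first(h_pos, r_plural, r_singular)

# The change in prediction is the quantity of interest.
print("original:", p_orig, "negative:", p_neg, "positive:", p_pos)
```

In this toy setup, the difference between `p_orig` and `p_neg` (or `p_pos`) plays the role of the measured behavioral change: if the readout did not depend on the feature direction, the three probabilities would be identical.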
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Software Engineering Research
Methods: Linear Layer · Linear Warmup With Linear Decay · Layer Normalization · Softmax · Multi-Head Attention · Weight Decay · WordPiece · Attention Dropout · Dropout · Attention Is All You Need
