Probing for targeted syntactic knowledge through grammatical error detection

Christopher Davis; Christopher Bryant; Andrew Caines; Marek Rei; Paula Buttery

arXiv:2210.16228·cs.CL·October 31, 2022

TL;DR

This paper investigates whether pre-trained language models can reliably detect subject-verb agreement (SVA) errors. It finds that some models encode the relevant syntactic information, but that probe performance varies with the training data and with the syntactic construction being tested.

Contribution

The study proposes grammatical error detection as a diagnostic probe for the syntactic knowledge in language model representations, and highlights limits on how robust and consistent that knowledge is.

Findings

01

Masked language models encode SVA information linearly.

02

Autoregressive models perform similarly to the baseline.

03

Performance varies with training data and syntactic constructions.

Abstract

Targeted studies testing knowledge of subject-verb agreement (SVA) indicate that pre-trained language models encode syntactic information. We assert that if models robustly encode subject-verb agreement, they should be able to identify when agreement is correct and when it is incorrect. To that end, we propose grammatical error detection as a diagnostic probe to evaluate token-level contextual representations for their knowledge of SVA. We evaluate contextual representations at each layer from five pre-trained English language models: BERT, XLNet, GPT-2, RoBERTa, and ELECTRA. We leverage public annotated training data from both English second language learners and Wikipedia edits, and report results on manually crafted stimuli for subject-verb agreement. We find that masked language models linearly encode information relevant to the detection of SVA errors, while the autoregressive…
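
The layer-wise probing setup described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes a plain logistic-regression probe trained by gradient descent, and it uses hand-made two-dimensional toy vectors in place of real contextual representations. The names `train_linear_probe`, `probe_accuracy`, `toy_layers`, `layer_0`, and `layer_1` are all hypothetical, and accuracy on the training tokens is used only to keep the example short.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def score(weights, bias, x):
    """Linear score of one token vector under the probe."""
    return sum(w * xi for w, xi in zip(weights, x)) + bias


def train_linear_probe(features, labels, epochs=200, lr=0.5):
    """Fit a logistic-regression probe with batch gradient descent."""
    dim = len(features[0])
    n = len(features)
    weights = [0.0] * dim
    bias = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * dim
        grad_b = 0.0
        for x, y in zip(features, labels):
            err = sigmoid(score(weights, bias, x)) - y
            for i in range(dim):
                grad_w[i] += err * x[i] / n
            grad_b += err / n
        weights = [w - lr * g for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b
    return weights, bias


def probe_accuracy(weights, bias, features, labels):
    """Fraction of tokens whose predicted label matches the gold label."""
    correct = 0
    for x, y in zip(features, labels):
        predicted = 1 if sigmoid(score(weights, bias, x)) > 0.5 else 0
        correct += int(predicted == y)
    return correct / len(labels)


# Assumption: toy vectors standing in for token representations at two layers.
# Label 1 marks a token with an agreement error, label 0 a correct token.
labels = [0, 0, 1, 1]
toy_layers = {
    # Every token has the same vector, so no linear probe can separate them.
    "layer_0": [[0.5, 0.5], [0.5, 0.5], [0.5, 0.5], [0.5, 0.5]],
    # Error and non-error tokens fall on opposite sides of a hyperplane.
    "layer_1": [[1.0, 0.0], [0.9, 0.1], [-1.0, 0.0], [-0.8, 0.2]],
}

layer_scores = {}
for name, feats in toy_layers.items():
    w, b = train_linear_probe(feats, labels)
    layer_scores[name] = probe_accuracy(w, b, feats, labels)

for name, acc in layer_scores.items():
    print(name, acc)
```

The probe is kept linear on purpose: if a linear classifier can separate error tokens from correct ones at a given layer, the relevant information is linearly encoded there, which is the sense in which the abstract describes masked language models. A real experiment would replace the toy vectors with hidden states extracted from each model layer and evaluate on held-out stimuli.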

Code & Models

Repositories

chrisdavis90/ged-syntax-probing
PyTorch · Official

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Text Readability and Simplification

Methods: Multi-Head Attention · Attention Is All You Need · Cosine Annealing · Byte Pair Encoding · SentencePiece · Linear Warmup With Cosine Annealing · Discriminative Fine-Tuning · Linear Warmup With Linear Decay · Attention Dropout