Investigating Novel Verb Learning in BERT: Selectional Preference Classes and Alternation-Based Syntactic Generalization

Tristan Thrush, Ethan Wilcox, and Roger Levy

arXiv:2011.02417·cs.CL·November 5, 2020

TL;DR

This study tests whether BERT can learn novel verbs from only a few fine-tuning examples, focusing on syntactic alternations and selectional preferences. It finds that BERT forms robust grammatical generalizations after just one or two instances of a novel verb.

Contribution

The paper introduces a novel word-learning paradigm for testing BERT's few-shot syntactic and semantic generalization to novel verbs.

Findings

01  BERT generalizes well after one or two examples

02  BERT shows a transitivity bias in verb behavior

03  Robust grammatical expectations are formed quickly

Abstract

Previous studies investigating the syntactic abilities of deep learning models have not targeted the relationship between the strength of the grammatical generalization and the amount of evidence to which the model is exposed during training. We address this issue by deploying a novel word-learning paradigm to test BERT's few-shot learning capabilities for two aspects of English verbs: alternations and classes of selectional preferences. For the former, we fine-tune BERT on a single frame in a verbal-alternation pair and ask whether the model expects the novel verb to occur in its sister frame. For the latter, we fine-tune BERT on an incomplete selectional network of verbal objects and ask whether it expects unattested but plausible verb/object pairs. We find that BERT makes robust grammatical generalizations after just one or two instances of a novel word in fine-tuning. For the verbal…
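
The alternation test described in the abstract can be illustrated with a minimal sketch. It only builds the data for one test item: a training sentence that shows the novel verb in one frame, and a masked probe sentence in the sister frame. Everything named here is an assumption made for illustration and is not taken from the paper or its repository: the nonce verb `daxed`, the two sentence templates, and the `build_item` helper. The scoring step (fine-tuning BERT on the training sentence and reading off the probability it assigns to the novel verb at the masked position) is deliberately left out.

```python
from dataclasses import dataclass

# Hypothetical nonce verb; the paper's actual novel tokens may differ.
NOVEL_VERB = "daxed"
# BERT's mask token, used to mark the position that would be scored.
MASK = "[MASK]"

# Hypothetical templates for one alternation pair (transitive vs. intransitive).
FRAMES = {
    "transitive": "The {agent} {verb} the {theme}.",
    "intransitive": "The {theme} {verb}.",
}


@dataclass(frozen=True)
class AlternationItem:
    train_sentence: str  # frame the model would be fine-tuned on
    test_sentence: str   # sister frame, with the verb position masked
    target: str          # token whose probability would be read at the mask


def build_item(train_frame: str, agent: str, theme: str,
               verb: str = NOVEL_VERB) -> AlternationItem:
    """Pair a training sentence in one frame with a masked probe in the other."""
    if train_frame not in FRAMES:
        raise ValueError(f"unknown frame: {train_frame!r}")
    # With exactly two frames, the sister frame is simply the other one.
    sister = next(name for name in FRAMES if name != train_frame)
    train = FRAMES[train_frame].format(agent=agent, theme=theme, verb=verb)
    test = FRAMES[sister].format(agent=agent, theme=theme, verb=MASK)
    return AlternationItem(train, test, verb)


item = build_item("transitive", "chef", "butter")
print(item.train_sentence)  # The chef daxed the butter.
print(item.test_sentence)   # The butter [MASK].
```

Under these assumptions, a model that has generalized across the alternation would assign the novel verb a relatively high probability at the masked position of the probe sentence, even though it never saw the verb in that frame.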

Code & Models

Repositories

TristanThrush/few-shot-lm-learning
PyTorch · Official

Taxonomy

Methods: Linear Layer · Softmax · Dense Connections · WordPiece · Linear Warmup With Linear Decay · Attention Dropout · Residual Connection · Adam · Dropout · Weight Decay