TL;DR
This paper investigates whether BERT captures constructional information, finding that it does encode significant constructional knowledge, which has implications for understanding what deep learning models learn from language.
Contribution
The study introduces probes and experiments demonstrating BERT's access to constructional information, bridging linguistic theory and deep learning.
Findings
BERT encodes constructional information extensively.
Constructional information is redundantly represented in lexico-semantics.
BERT's knowledge includes aspects traditionally associated with constructions.
Abstract
While lexico-semantic elements no doubt capture a large amount of linguistic information, it has been argued that they do not capture all information contained in text. This assumption is central to constructionist approaches to language which argue that language consists of constructions, learned pairings of a form and a function or meaning that are either frequent or have a meaning that cannot be predicted from its component parts. BERT's training objectives give it access to a tremendous amount of lexico-semantic information, and while BERTology has shown that BERT captures certain important linguistic dimensions, there have been no studies exploring the extent to which BERT might have access to constructional information. In this work we design several probes and conduct extensive experiments to answer this question. Our results allow us to conclude that BERT does indeed have access…
Taxonomy
Methods: Linear Layer · Attention Dropout · Dropout · Softmax · Multi-Head Attention · Residual Connection · Dense Connections · WordPiece · Linear Warmup With Linear Decay
