Discontinuous Constituency and BERT: A Case Study of Dutch
Konstantinos Kogkalidis, Gijs Wijnholds

TL;DR
This study evaluates BERT's ability to handle complex Dutch syntactic patterns that are mildly context-sensitive, revealing limitations in its implicit understanding of certain linguistic dependencies.
Contribution
It introduces a test suite, built from grammars in a mildly context-sensitive formalism, to assess BERT's syntactic capabilities on non-context-free Dutch structures.
Findings
The models investigated struggle to pair verbs with their subjects under control verb nesting.
They likewise fail to capture verb-subject dependencies under verb raising.
The accompanying analysis suggests that these dependencies are not acquired implicitly.
Abstract
In this paper, we set out to quantify the syntactic capacity of BERT in the evaluation regime of non-context free patterns, as occurring in Dutch. We devise a test suite based on a mildly context-sensitive formalism, from which we derive grammars that capture the linguistic phenomena of control verb nesting and verb raising. The grammars, paired with a small lexicon, provide us with a large collection of naturalistic utterances, annotated with verb-subject pairings, that serve as the evaluation test bed for an attention-based span selection probe. Our results, backed by extensive analysis, suggest that the models investigated fail in the implicit acquisition of the dependencies examined.
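To make the test bed concrete, here is a minimal sketch of how a verb-raising clause with crossing dependencies could be generated together with its verb-subject pairings. It is a toy illustration only, not the authors' grammar, lexicon, or probe: the function names `cross_serial` and `pairing_accuracy`, the word lists, and the index-based annotation format are all assumptions made for this example.

```python
# Toy sketch (assumed, not the paper's grammar): build a Dutch subordinate
# clause in which the i-th verb takes the i-th noun as its subject,
# e.g. "dat Jan Piet Marie zag helpen zwemmen".

def cross_serial(subjects, verbs, prefix=("dat",)):
    """Return (tokens, pairs) where pairs maps verb index -> subject index."""
    if len(subjects) != len(verbs):
        raise ValueError("need exactly one subject per verb")
    tokens = list(prefix) + list(subjects) + list(verbs)
    offset = len(prefix)   # position of the first subject
    n = len(subjects)
    # Crossing dependencies: verb i (after all subjects) pairs with subject i.
    pairs = {offset + n + i: offset + i for i in range(n)}
    return tokens, pairs


def pairing_accuracy(gold, predicted):
    """Fraction of verbs whose predicted subject index matches the gold one."""
    correct = sum(1 for verb, subj in gold.items() if predicted.get(verb) == subj)
    return correct / len(gold)


# Hypothetical lexicon entries, chosen only for illustration.
tokens, pairs = cross_serial(["Jan", "Piet", "Marie"], ["zag", "helpen", "zwemmen"])
sentence = " ".join(tokens)
```

A probe's output can then be scored against `pairs` with `pairing_accuracy`. For instance, a predictor that always picks the noun nearest to the verb would get only the first verb-subject pair of this clause right.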
Taxonomy
Topics: Natural Language Processing Techniques · Speech and dialogue systems · Topic Modeling
Methods: Multi-Head Attention · Attention Is All You Need · Linear Layer · Attention Dropout · WordPiece · Residual Connection · Layer Normalization · Dropout · Adam · Dense Connections
