Word-order typology in Multilingual BERT: A case study in subordinate-clause detection
Dmitry Nikolaev, Sebastian Padó

TL;DR
This paper investigates how multilingual BERT detects subordinate clauses within and across languages. The task turns out to be deceptively simple: easy cases yield quick gains, a long tail of harder instances remains, and zero-shot performance is heavily influenced by word-order typology.
Contribution
It shows that BERT's zero-shot ability to identify subordinate clauses is dominated by word-order effects that mirror the SVO/VSO/SOV typology, highlighting limitations in the syntactic abstractions it learns across languages.
Findings
BERT's zero-shot subordinate-clause detection is influenced by word order typology.
Easy cases are quickly learned, but harder cases form a long tail.
Word-order effects, mirroring the SVO/VSO/SOV typology, dominate BERT's zero-shot cross-lingual performance on this task.
Abstract
The capabilities and limitations of BERT and similar models are still unclear when it comes to learning syntactic abstractions, in particular across languages. In this paper, we use the task of subordinate-clause detection within and across languages to probe these properties. We show that this task is deceptively simple, with easy gains offset by a long tail of harder cases, and that BERT's zero-shot performance is dominated by word-order effects, mirroring the SVO/VSO/SOV typology.
Taxonomy
Topics: Natural Language Processing Techniques · Topic Modeling · Speech and dialogue systems
Methods: Attention Is All You Need · Linear Layer · Weight Decay · Linear Warmup With Linear Decay · Dense Connections · Dropout · WordPiece · Adam · Attention Dropout
