Multilingual BERT has an accent: Evaluating English influences on fluency in multilingual models
Isabel Papadimitriou, Kezia Lopez, Dan Jurafsky

TL;DR
Multilingual BERT shows an English-like grammatical bias: compared with monolingual Spanish and Greek models, it favors explicit pronouns and SVO order. The paper argues that this bias affects fluency and calls for linguistically-aware evaluation.
Contribution
It introduces a novel method for comparing multilingual and monolingual fluency, demonstrating grammatical structure bias in multilingual models.
Findings
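Multilingual BERT prefers English-like grammatical structures more strongly than the monolingual control models do.
In Spanish the bias favors explicit pronouns over pronoun-drop; in Greek it favors Subject-Verb-Object ordering.
The results highlight the importance of linguistically-aware fluency evaluation for multilingual models.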
Abstract
While multilingual language models can improve NLP performance on low-resource languages by leveraging higher-resource languages, they also reduce average performance on all languages (the 'curse of multilinguality'). Here we show another problem with multilingual models: grammatical structures in higher-resource languages bleed into lower-resource languages, a phenomenon we call grammatical structure bias. We show this bias via a novel method for comparing the fluency of multilingual models to the fluency of monolingual Spanish and Greek models: testing their preference for two carefully-chosen variable grammatical structures (optional pronoun-drop in Spanish and optional Subject-Verb ordering in Greek). We find that multilingual BERT is biased toward the English-like setting (explicit pronouns and Subject-Verb-Object ordering) as compared to our monolingual control language model.…
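The comparison described in the abstract can be illustrated with a minimal sketch. It assumes each model assigns a scalar fluency score (for example, a log-likelihood-style score) to every test sentence; the abstract does not say how preference is scored, so the scoring scheme, the function names `preference` and `structure_bias`, and the toy numbers below are all hypothetical and are not the authors' implementation.

```python
from statistics import mean


def preference(scores_a, scores_b):
    """Mean score gap between sentences using structure A and structure B.

    Positive values mean the model scores structure A (e.g. explicit
    pronouns) higher on average than structure B (e.g. pronoun-drop).
    """
    return mean(scores_a) - mean(scores_b)


def structure_bias(multi_a, multi_b, mono_a, mono_b):
    """Preference of the multilingual model minus that of the monolingual control.

    Positive values mean the multilingual model leans further toward
    structure A than the monolingual control does.
    """
    return preference(multi_a, multi_b) - preference(mono_a, mono_b)


# Toy, made-up scores (assumption): higher means more fluent to the model.
multi_explicit = [-10.0, -12.0]
multi_dropped = [-11.0, -13.0]
mono_explicit = [-10.0, -12.0]
mono_dropped = [-10.0, -12.0]

bias = structure_bias(multi_explicit, multi_dropped, mono_explicit, mono_dropped)
print(bias)
```

Subtracting the monolingual control's preference is the key design choice in this sketch: a structure may simply be more common in the language, so only the gap between the two models is read as bias.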
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Multimodal Machine Learning Applications
Methods: Multi-Head Attention · Attention Is All You Need · Linear Layer · Layer Normalization · Residual Connection · Dropout · Weight Decay · Linear Warmup With Linear Decay · Adam
