Capacity Constraints and the Multilingual Penalty for Lexical Disambiguation
Sean Trott, Pamela D. Rivière

TL;DR
This paper investigates why multilingual language models underperform monolingual ones on lexical disambiguation, and identifies three capacity constraints as key factors: representational (embedding isotropy), attentional (attention to disambiguating cues), and vocabulary-related (multi-token segmentation).
Contribution
It quantifies the multilingual penalty for lexical disambiguation and links it to specific capacity limitations in multilingual language models.
Findings
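Multilingual LMs show consistently reduced performance in lexical disambiguation relative to monolingual LMs from the same families.
Three capacity constraints help explain the performance differences: reduced embedding isotropy, reduced attention to disambiguating cues, and increased multi-token segmentation.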
These constraints account for variance previously attributed solely to multilingual status.
Abstract
Multilingual language models (LMs) sometimes underperform their monolingual counterparts, possibly due to capacity limitations. We quantify this "multilingual penalty" for lexical disambiguation (a task requiring precise semantic representations and contextualization mechanisms) using controlled datasets of human relatedness judgments for ambiguous words in both English and Spanish. Comparing monolingual and multilingual LMs from the same families, we find consistently reduced performance in multilingual LMs. We then explore three potential capacity constraints: representational (reduced embedding isotropy), attentional (reduced attention to disambiguating cues), and vocabulary-related (increased multi-token segmentation). Multilingual LMs show some evidence of all three limitations; moreover, these factors statistically account for the variance formerly attributed to a model's…
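Two of the constraints named in the abstract can be illustrated with a small sketch. The abstract does not say how isotropy or segmentation were measured, so everything below is an assumption for illustration only: mean pairwise cosine similarity is used as a rough anisotropy proxy (higher values mean embeddings crowd into a narrow cone, i.e. lower isotropy), and `toy_tokenize` is a hypothetical stand-in for a real subword tokenizer.

```python
import math
from itertools import combinations


def cosine(u, v):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def mean_pairwise_cosine(vectors):
    """Average cosine similarity over all unordered pairs of vectors.

    Assumed proxy (not necessarily the paper's measure): a higher value
    suggests a less isotropic embedding space.
    """
    pairs = list(combinations(vectors, 2))
    return sum(cosine(u, v) for u, v in pairs) / len(pairs)


def multi_token_rate(words, tokenize):
    """Fraction of words that the tokenizer splits into more than one token."""
    split = sum(1 for w in words if len(tokenize(w)) > 1)
    return split / len(words)


def toy_tokenize(word):
    """Hypothetical tokenizer: cuts a word into chunks of at most 4 characters."""
    return [word[i:i + 4] for i in range(0, len(word), 4)]


# Toy embeddings: one set spread around the origin, one packed in a narrow cone.
spread = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0], [0.0, -1.0]]
cone = [[1.0, 0.1], [1.0, 0.0], [1.0, -0.1]]

print(mean_pairwise_cosine(spread))  # lower: directions are spread out
print(mean_pairwise_cosine(cone))    # higher: directions nearly coincide
print(multi_token_rate(["bat", "bank", "banco", "murciélago"], toy_tokenize))
```

With real models, the vectors would be contextual embeddings of the ambiguous words and `tokenize` would be the model's own tokenizer; the comparison of interest is then between a monolingual model and its multilingual counterpart on the same inputs.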
Taxonomy
Topics: Topic Modeling · Neurobiology of Language and Bilingualism · Natural Language Processing Techniques
