Lost in the Tower of Babel: The Adverse Effects of Incidental Multilingualism in LLMs
Anjishnu Mukherjee, Chutong Meng, Antonios Anastasopoulos

TL;DR
This paper critiques the current incidental multilingualism paradigm in LLMs, highlighting its flaws and proposing a shift towards intentional multilingual design for fairness and robustness.
Contribution
It provides an empirical analysis of which languages LLMs self-report as supported and which languages they actually respond in, and advocates for multilingualism by design as a core research goal.
Findings
The languages LLMs self-report as supported often differ from the languages they actually respond in.
Simple language-change attacks can reveal hidden language assumptions.
Current models exhibit brittle and unequal multilingual behavior.
Abstract
This paper argues that contemporary multilingual NLP has converged on a fragile and misleading paradigm of incidental multilingualism. Today's LLMs appear multilingual largely because they are trained on massive, uneven web corpora, not because multilingual or multicultural competence has been treated as a core design objective. We contend that this paradigm systematically produces unequal, brittle, and opaque behavior across languages, with severe consequences in real-world and agentic deployments where models must reason, plan, and act across multiple linguistic contexts. We report a focused empirical study of two practical questions: which languages models self-report as supported and which languages they actually respond in across multilingual prompts. We additionally demonstrate how even a simple language-change attack can surface these failures and expose hidden assumptions about…
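As a rough illustration of the two questions in the abstract (self-reported support versus actual response language), the comparison could be sketched as below. This is not the authors' evaluation code. The `Trial` record, the 0.5 threshold, and the function names are hypothetical, and the response language is assumed to have been identified by some separate language-identification step.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Trial:
    """One prompt/response pair (hypothetical illustration record)."""
    prompt_lang: str    # language code of the prompt
    response_lang: str  # language code detected in the model's reply


def response_match_rate(trials):
    """Per prompt language, the fraction of replies written in that same language."""
    totals, matches = Counter(), Counter()
    for t in trials:
        totals[t.prompt_lang] += 1
        if t.response_lang == t.prompt_lang:
            matches[t.prompt_lang] += 1
    return {lang: matches[lang] / totals[lang] for lang in totals}


def support_gaps(self_reported, rates, threshold=0.5):
    """Compare claimed support against observed behavior.

    Returns (overclaimed, unclaimed):
      overclaimed: claimed languages whose match rate is below the threshold
                   (claimed languages with no trials count as rate 0.0)
      unclaimed:   languages answered in at or above the threshold
                   that the model never claimed to support
    The threshold is an arbitrary assumption for this sketch.
    """
    overclaimed = {l for l in self_reported if rates.get(l, 0.0) < threshold}
    unclaimed = {l for l, r in rates.items()
                 if r >= threshold and l not in self_reported}
    return overclaimed, unclaimed


if __name__ == "__main__":
    # Made-up records, only to show the shape of the comparison.
    demo = [Trial("en", "en"), Trial("bn", "en"), Trial("bn", "bn")]
    print(response_match_rate(demo))
```

A language-change probe could reuse the same records by switching the prompt language partway through a conversation and checking whether `response_lang` follows the switch.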
