Language Bias under Conflicting Information in Multilingual LLMs
Robert Östling, Murathan Kurfalı

TL;DR
This study investigates language bias in multilingual LLMs faced with conflicting information. It finds language preferences that are consistent across models, regardless of where the models were trained.
Contribution
It extends the conflicting-needles-in-a-haystack paradigm to a multilingual setting and evaluates language bias on naturalistic news-domain data in five languages, across multilingual LLMs of different sizes.
Findings
All tested LLMs, including GPT-5.2, ignore the conflict and confidently assert a single answer in the large majority of cases.
Models show a general bias against Russian and, at the longest context lengths, a bias in favor of Chinese.
Bias patterns are consistent across models trained inside and outside China.
Abstract
Large Language Models (LLMs) have been shown to contain biases in the process of integrating conflicting information when answering questions. Here we ask whether such biases also exist with respect to which language is used for each conflicting piece of information. To answer this question, we extend the conflicting needles in a haystack paradigm to a multilingual setting and perform a comprehensive set of evaluations with naturalistic news domain data in five different languages, for a range of multilingual LLMs of different sizes. We find that all LLMs tested, including GPT-5.2, ignore the conflict and confidently assert only one of the possible answers in the large majority of cases. Furthermore, there is a consistent bias across models in which languages are preferred, with a general bias against Russian and, for the longest context lengths, in favor of Chinese. Both of these…
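To make the setup concrete, here is a minimal sketch of how a multilingual conflicting-needles test case could be assembled and scored. It is not the authors' implementation. The names `Needle`, `build_context`, and `classify_response`, the random insertion strategy, and the substring-based scoring are all assumptions made for illustration, and the example passages are invented toy data, not taken from the paper.

```python
import random
from dataclasses import dataclass


@dataclass
class Needle:
    """One piece of conflicting evidence (hypothetical structure)."""
    lang: str    # language code of the passage, e.g. "en" or "ru"
    text: str    # passage stating one candidate answer
    answer: str  # the answer string this passage supports


def build_context(filler, needles, seed=0):
    """Insert each needle at a random position among the filler passages.

    Assumption: positions are sampled uniformly; the paper may place
    needles differently.
    """
    rng = random.Random(seed)
    docs = list(filler)
    for needle in needles:
        docs.insert(rng.randint(0, len(docs)), needle.text)
    return "\n\n".join(docs)


def classify_response(response, needles):
    """Label a model response by which needle answers it mentions.

    Returns the language of the single asserted answer, "conflict" if
    more than one answer is mentioned, or "none" if no answer appears.
    Substring matching is a simplification chosen for this sketch.
    """
    mentioned = [n.lang for n in needles if n.answer.lower() in response.lower()]
    if len(mentioned) == 1:
        return mentioned[0]
    return "conflict" if mentioned else "none"


# Invented toy example: two passages disagree about a year.
needles = [
    Needle("en", "The bridge opened in 1987.", "1987"),
    Needle("ru", "Мост был открыт в 1992 году.", "1992"),
]
context = build_context(["Filler A.", "Filler B.", "Filler C."], needles, seed=1)
label = classify_response("The bridge opened in 1992.", needles)
```

Under these assumptions, a language bias would show up as one language label being returned far more often than the others over many such cases, and the paper's observation that models ignore the conflict would correspond to "conflict" being rare.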
