Polyglots or Multitudes? Multilingual LLM Answers to Value-laden Multiple-Choice Questions
Léo Labat, Etienne Ollion, François Yvon

TL;DR
This study investigates whether multilingual large language models respond consistently across languages on value-laden multiple-choice questions, revealing that larger, instruction-tuned models show higher consistency but still exhibit language-specific variations on certain questions.
Contribution
The paper introduces the Multilingual European Value Survey (MEVS) dataset and systematically analyzes multilingual LLM responses across languages and models, highlighting the effects of size and fine-tuning.
Findings
Larger, instruction-tuned models are more consistent across languages.
Response robustness varies greatly depending on the specific question.
All consistent, fine-tuned models still show language-specific behavior, but only on certain questions.
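To make "consistency across languages" concrete, here is a minimal sketch of one possible per-question measure: the share of language pairs in which a model picks the same option. It is an illustrative assumption, not the metric the paper uses. The function names and the toy answers are hypothetical placeholders and are not taken from MEVS.

```python
from collections import Counter
from itertools import combinations


def pairwise_agreement(answers_by_lang):
    """Fraction of language pairs that selected the same option.

    answers_by_lang maps a language code to the option label a model
    chose for one question asked in that language (hypothetical format).
    """
    pairs = list(combinations(sorted(answers_by_lang), 2))
    if not pairs:
        raise ValueError("need answers in at least two languages")
    same = sum(answers_by_lang[a] == answers_by_lang[b] for a, b in pairs)
    return same / len(pairs)


def majority_option(answers_by_lang):
    """Most frequently selected option across languages."""
    return Counter(answers_by_lang.values()).most_common(1)[0][0]


# Made-up answers for a single question, for illustration only.
toy_answers = {"en": "B", "fr": "B", "de": "B", "it": "C"}

print(pairwise_agreement(toy_answers))  # 3 matching pairs out of 6
print(majority_option(toy_answers))
```

A score of 1.0 would correspond to the "polyglot" case for that question, while lower scores indicate that the selected option depends on the language of the prompt. Averaging the score over questions gives a per-model summary, though that hides the strong question-to-question variation noted above.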
Abstract
Multiple-Choice Questions (MCQs) are often used to assess knowledge, reasoning abilities, and even values encoded in large language models (LLMs). While the effect of multilingualism has been studied on LLM factual recall, this paper seeks to investigate the less explored question of language-induced variation in value-laden MCQ responses. Are multilingual LLMs consistent in their responses across languages, i.e. behave like theoretical polyglots, or do they answer value-laden MCQs depending on the language of the question, like a multitude of monolingual models expressing different values through a single model? We release a new corpus, the Multilingual European Value Survey (MEVS), which, unlike prior work relying on machine translation or ad hoc prompts, solely comprises human-translated survey questions aligned in 8 European languages. We administer a subset of those questions to…
Taxonomy
Topics: Topic Modeling · Sentiment Analysis and Opinion Mining · Natural Language Processing Techniques
