Beyond English-Centric LLMs: What Language Do Multilingual Language Models Think in?
Chengzhi Zhong, Fei Cheng, Qianying Liu, Junfeng Jiang, Zhen Wan, Chenhui Chu, Yugo Murawaki, Sadao Kurohashi

TL;DR
This paper investigates the internal language representations of multilingual LLMs, revealing how models activate different latent languages internally and how these shift across layers during multilingual tasks.
Contribution
The paper introduces the concept of internal latent languages in LLMs and analyzes their activation patterns across different models and languages, especially Japanese and English.
Findings
Llama2 relies solely on English as its internal latent language.
Swallow and LLM-jp utilize both Japanese and English as latent languages.
Models activate the latent language most related to the target language.
Abstract
In this study, we investigate whether non-English-centric LLMs, despite their strong performance, 'think' in their respective dominant language: more precisely, 'think' refers to how the representations of intermediate layers, when un-embedded into the vocabulary space, exhibit higher probabilities for certain dominant languages during generation. We term such languages internal latent languages. We examine the latent language of three typical categories of models for Japanese processing: Llama2, an English-centric model; Swallow, an English-centric model with continued pre-training in Japanese; and LLM-jp, a model pre-trained on balanced English and Japanese corpora. Our empirical findings reveal that, unlike Llama2, which relies exclusively on English as the internal latent language, Japanese-specific Swallow and LLM-jp employ both Japanese and English, exhibiting dual…
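The un-embedding step described in the abstract can be illustrated with a toy sketch. This is not the authors' code: the hidden states and un-embedding matrix below are random stand-ins, the function names (`softmax`, `unembed`, `language_mass_per_layer`) and the token-to-language labels are hypothetical, and the final normalization layer that a real model applies before un-embedding is omitted. The sketch only shows the general idea of projecting each layer's representation onto the vocabulary and summing the probability mass assigned to tokens of each language.

```python
import math
import random


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def unembed(hidden, unembedding):
    """Project one hidden vector onto the vocabulary.

    hidden: list of d_model floats.
    unembedding: one row of d_model floats per vocabulary token.
    """
    return [sum(h * w for h, w in zip(hidden, row)) for row in unembedding]


def language_mass_per_layer(hidden_states, unembedding, token_lang):
    """For each layer, sum next-token probability by token language.

    token_lang[i] is an assumed language label for vocabulary token i.
    """
    per_layer = []
    for hidden in hidden_states:
        probs = softmax(unembed(hidden, unembedding))
        mass = {}
        for p, lang in zip(probs, token_lang):
            mass[lang] = mass.get(lang, 0.0) + p
        per_layer.append(mass)
    return per_layer


# Toy data (assumption): 4 layers, 8-dim hidden states, 6-token vocabulary.
random.seed(0)
n_layers, d_model = 4, 8
token_lang = ["en", "en", "en", "ja", "ja", "ja"]
hidden_states = [[random.gauss(0, 1) for _ in range(d_model)]
                 for _ in range(n_layers)]
unembedding = [[random.gauss(0, 1) for _ in range(d_model)]
               for _ in token_lang]

masses = language_mass_per_layer(hidden_states, unembedding, token_lang)
for layer, mass in enumerate(masses):
    print(f"layer {layer}: en={mass['en']:.3f} ja={mass['ja']:.3f}")
```

With a real model, the hidden states would come from the intermediate layers at a given generation step, and a layer where one language holds most of the mass would suggest that language is acting as the latent language at that depth.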
Taxonomy
Topics: Natural Language Processing Techniques · Legal Language and Interpretation · Linguistics and Terminology Studies
