The Emergence of Abstract Thought in Large Language Models Beyond Any Language
Yuxin Chen, Yiran Zhao, Yang Zhang, An Zhang, Kenji Kawaguchi, Shafiq Joty, Junnan Li, Tat-Seng Chua, Michael Qizhe Shieh, Wenxuan Zhang

TL;DR
This paper reports that large language models develop a small but critical set of language-agnostic parameters that underpin their ability to think abstractly across multiple languages, challenging the idea that they operate solely in English.
Contribution
It identifies a core language-agnostic parameter space in LLMs and demonstrates how shared neurons support abstract thought beyond specific languages.
Findings
Development of a small, critical parameter subset essential for multilingual performance
Increase in shared neurons correlates with model development stages
Neuron-specific training improves multilingual generalization
Abstract
As large language models (LLMs) continue to advance, their capacity to function effectively across a diverse range of languages has shown marked improvement. Preliminary studies observe that the hidden activations of LLMs often resemble English, even when responding to non-English prompts. This has led to the widespread assumption that LLMs may "think" in English. However, more recent results showing strong multilingual performance, even surpassing English performance on specific tasks in other languages, challenge this view. In this work, we find that LLMs progressively develop a core language-agnostic parameter space, a remarkably small subset of parameters whose deactivation results in significant performance degradation across all languages. This compact yet critical set of parameters underlies the model's ability to generalize beyond individual languages, supporting the emergence of…
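The idea of a shared, language-agnostic neuron set and its deactivation can be illustrated with a toy sketch. This is not the paper's procedure: the summary above does not say how neurons are scored or selected, so the per-language importance scores, the top-k selection rule, the intersection step, and the function names `top_k_neurons`, `shared_neurons`, and `deactivate` are all assumptions made for illustration only.

```python
from typing import Dict, List, Set


def top_k_neurons(scores: List[float], k: int) -> Set[int]:
    """Return indices of the k highest-scoring neurons.

    The scores are hypothetical importance values for one language.
    """
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return set(order[:k])


def shared_neurons(scores_by_lang: Dict[str, List[float]], k: int) -> Set[int]:
    """Neurons that rank in the top k for every language (assumed criterion)."""
    per_lang = [top_k_neurons(s, k) for s in scores_by_lang.values()]
    return set.intersection(*per_lang)


def deactivate(activations: List[float], neurons: Set[int]) -> List[float]:
    """Zero out the selected neurons, leaving the others unchanged."""
    return [0.0 if i in neurons else a for i, a in enumerate(activations)]


# Made-up importance scores for five neurons in three languages.
scores_by_lang = {
    "en": [0.9, 0.1, 0.8, 0.2, 0.7],
    "zh": [0.8, 0.7, 0.9, 0.1, 0.2],
    "fr": [0.7, 0.2, 0.9, 0.8, 0.1],
}

shared = shared_neurons(scores_by_lang, k=3)
ablated = deactivate([1.0, 2.0, 3.0, 4.0, 5.0], shared)
print(sorted(shared), ablated)
```

In this toy data only neurons 0 and 2 rank in the top three for all three languages, so only those two activations are zeroed. In a real model, the comparable experiment would measure task performance in each language before and after such an ablation.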
Taxonomy
Topics: Neurobiology of Language and Bilingualism · Topic Modeling · Artificial Intelligence in Healthcare and Education
