TL;DR
This paper investigates how multilingual alignment improves LLMs' capabilities by analyzing language neurons, proposing a neuron classification method, and examining internal processes before and after alignment.
Contribution
It introduces a ternary neuron classification and an identification algorithm, offering new insights into the mechanisms of multilingual alignment in LLMs.
Findings
Neurons can be categorized into language-specific, language-related, and general types.
Multilingual inference involves understanding, reasoning, transformation, and output stages.
Analysis reveals phenomena of spontaneous multilingual alignment.
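The three-way categorization above can be illustrated with a minimal sketch. This is not the paper's identification algorithm, which this summary does not describe. It assumes some earlier step has already decided, for each neuron, the set of languages for which it counts as active. The function name `classify_neurons`, the neuron IDs, and the extra "unassigned" label for neurons active in no language are all hypothetical choices made for this illustration.

```python
from typing import Dict, Set


def classify_neurons(
    active_languages: Dict[str, Set[str]],
    all_languages: Set[str],
) -> Dict[str, str]:
    """Label each neuron by how many languages it is active for.

    Assumption (not from the paper): `active_languages` maps a neuron ID to
    the set of languages for which that neuron was flagged as active by some
    upstream criterion that is outside the scope of this sketch.
    """
    labels: Dict[str, str] = {}
    for neuron, langs in active_languages.items():
        # Ignore any language that is not part of the evaluated set.
        count = len(langs & all_languages)
        if count == 0:
            # Hypothetical fourth bucket used only by this sketch.
            labels[neuron] = "unassigned"
        elif count == 1:
            labels[neuron] = "language-specific"
        elif count < len(all_languages):
            # Shared by several languages, but not by all of them.
            labels[neuron] = "language-related"
        else:
            labels[neuron] = "general"
    return labels


if __name__ == "__main__":
    languages = {"en", "zh", "sw"}
    example = {
        "n0": {"en"},
        "n1": {"en", "zh"},
        "n2": {"en", "zh", "sw"},
    }
    print(classify_neurons(example, languages))
```

In this sketch the middle branch is what separates the ternary scheme from a binary split between language-specific and general neurons: a neuron shared by some but not all languages gets its own label instead of being forced into one of the other two.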
Abstract
Multilingual alignment is an effective and representative paradigm for enhancing LLMs' multilingual capabilities: it transfers capabilities from high-resource languages to low-resource languages. Meanwhile, research on language-specific neurons provides a new perspective for analyzing and understanding LLMs' mechanisms. However, we find that many neurons are shared by multiple but not all languages and cannot be correctly classified as either language-specific or general. In this work, we propose a ternary classification methodology that categorizes neurons into three types: language-specific neurons, language-related neurons, and general neurons. We also propose a corresponding identification algorithm to distinguish these types of neurons. Furthermore, based on the distributional characteristics of different types of neurons, we divide the LLMs' internal process for multilingual…
