Cross-Lingual Generalization and Compression: From Language-Specific to Shared Neurons

Frederick Riemenschneider; Anette Frank

arXiv:2506.01629·cs.CL·June 3, 2025

TL;DR

This paper investigates how multilingual language models develop shared representations during pre-training. It reports a transition from language-specific representations to cross-lingual abstractions, with neurons gradually aligning across languages.

Contribution

The paper provides a detailed analysis of how representations evolve in multilingual language models (MLLMs) during pre-training, highlighting the emergence of shared neurons and cross-lingual concepts.

Findings

01. Models initially form language-specific representations.

02. Representations gradually converge into cross-lingual abstractions.

03. Neurons become reliable predictors for concepts across languages.

Abstract

Multilingual language models (MLLMs) have demonstrated remarkable abilities to transfer knowledge across languages, despite being trained without explicit cross-lingual supervision. We analyze the parameter spaces of three MLLMs to study how their representations evolve during pre-training, observing patterns consistent with compression: models initially form language-specific representations, which gradually converge into cross-lingual abstractions as training progresses. Through probing experiments, we observe a clear transition from uniform language identification capabilities across layers to more specialized layer functions. For deeper analysis, we focus on neurons that encode distinct semantic concepts. By tracing their development during pre-training, we show how they gradually align across languages. Notably, we identify specific neurons that emerge as increasingly reliable…
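
The probing experiments mentioned in the abstract can be illustrated with a toy example. The sketch below is not the authors' setup: it assumes synthetic "hidden states" in place of real model activations, and a ridge-regularized least-squares probe in place of whatever classifier the paper uses. All names (`fit_linear_probe`, `probe_accuracy`, `synthetic_layer`) and all parameter values are hypothetical. It only shows the general idea that a linear probe identifies the language well when a layer encodes language identity and falls to chance when it does not.

```python
import numpy as np

def fit_linear_probe(X, y, n_classes, ridge=1e-3):
    """Fit a ridge-regularized least-squares probe onto one-hot labels."""
    Xb = np.hstack([X, np.ones((X.shape[0], 1))])  # append a bias column
    Y = np.eye(n_classes)[y]                       # one-hot language labels
    A = Xb.T @ Xb + ridge * np.eye(Xb.shape[1])
    return np.linalg.solve(A, Xb.T @ Y)

def probe_accuracy(W, X, y):
    """Fraction of examples whose highest-scoring class matches the label."""
    Xb = np.hstack([X, np.ones((X.shape[0], 1))])
    return float(np.mean(np.argmax(Xb @ W, axis=1) == y))

def synthetic_layer(rng, y, n_classes, dim, signal):
    """Toy stand-in for one layer: per-language means scaled by `signal`, plus noise."""
    means = rng.normal(size=(n_classes, dim))
    return signal * means[y] + rng.normal(size=(y.shape[0], dim))

# Assumed toy configuration: 3 languages, 16-dim states, 600 examples.
rng = np.random.default_rng(0)
n_langs, dim, n = 3, 16, 600
y = rng.integers(0, n_langs, size=n)
train, test = slice(0, 400), slice(400, n)

accs = {}
for name, signal in [("strong_signal", 3.0), ("no_signal", 0.0)]:
    X = synthetic_layer(rng, y, n_langs, dim, signal)
    W = fit_linear_probe(X[train], y[train], n_langs)
    accs[name] = probe_accuracy(W, X[test], y[test])

print(accs)
```

In a real analysis, `synthetic_layer` would be replaced by activations extracted from each layer of a model checkpoint, and the probe would be refit per layer and per checkpoint to compare language-identification accuracy over training.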

Taxonomy

Topics: Natural Language Processing Techniques