NeuronMoE: Neuron-Guided Mixture-of-Experts for Efficient Multilingual LLM Extension
Rongzhi Li, Hitomi Yanaka

TL;DR
NeuronMoE introduces a neuron-guided approach to efficiently extend multilingual large language models by optimizing expert allocation based on neuron diversity, reducing parameters while maintaining performance.
Contribution
The paper presents NeuronMoE, a novel method that analyzes neuron-level specialization to guide expert allocation in multilingual models, improving efficiency over layer-based methods.
Findings
Achieves approximately 40% average parameter reduction on Llama-3.2-3B while maintaining performance.
Low-resource languages develop neuron specialization patterns similar to high-resource languages.
Neuron specialization is concentrated in early and late layers, a pattern that suggests a shared organizing principle across the languages studied.
Abstract
Extending large language models to low-resource languages is essential for global accessibility, but training separate models per language is prohibitively expensive. Mixture-of-Experts (MoE) architectures address this by adding sparse language-specific parameters, but determining how many experts each layer needs remains an open question. Current approaches allocate experts based on layer-level similarity, yet language processing exhibits fine-grained specialization at individual neurons. We propose NeuronMoE, a method that analyzes language-specific neurons across all transformer components to guide expert allocation per layer based on empirically measured cross-lingual neuron diversity. Applied to Llama-3.2-3B for low-resource languages (Greek, Turkish, and Hungarian), this approach achieves approximately 40% average parameter reduction while matching the performance of…
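As a rough illustration of the allocation idea described in the abstract, the sketch below maps per-layer counts of language-specific neurons to per-layer expert counts. The linear scaling rule, the function name allocate_experts, the bounds min_experts and max_experts, and the example counts are all assumptions made for illustration; the abstract does not state the paper's actual allocation rule, and the numbers here are not results from the paper.

```python
from typing import List


def allocate_experts(
    neuron_counts: List[int],
    min_experts: int = 1,
    max_experts: int = 4,
) -> List[int]:
    """Assign more experts to layers with more language-specific neurons.

    Hypothetical rule: scale each layer's count linearly into the range
    [min_experts, max_experts]. This is an assumed stand-in, not the
    allocation rule used by NeuronMoE.
    """
    if not neuron_counts:
        return []
    lo, hi = min(neuron_counts), max(neuron_counts)
    if hi == lo:
        # No diversity signal between layers: fall back to the minimum.
        return [min_experts] * len(neuron_counts)
    span = max_experts - min_experts
    return [
        min_experts + round(span * (count - lo) / (hi - lo))
        for count in neuron_counts
    ]


# Made-up counts of language-specific neurons for a toy 5-layer model,
# with more specialization in the first and last layers.
toy_counts = [90, 30, 10, 20, 100]
experts_per_layer = allocate_experts(toy_counts)
print(experts_per_layer)
```

In this toy setting the first and last layers receive the most experts, while the middle layers fall back to the minimum. That is the kind of non-uniform allocation a neuron-guided scheme would be expected to produce when specialization is concentrated in early and late layers.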
Taxonomy
Topics: Topic Modeling · Multimodal Machine Learning Applications · Natural Language Processing Techniques
