Mix-of-Language-Experts Architecture for Multilingual Programming
Yifan Zong, Yuntian Deng, Pengyu Nie

TL;DR
This paper presents MoLE, a novel architecture that efficiently balances language-specific specialization and parameter sharing for multilingual programming tasks using a mix-of-experts approach with LoRA modules.
Contribution
MoLE introduces a joint optimization of shared and language-specific LoRA modules, enabling efficient and specialized multilingual programming models.
Findings
MoLE outperforms single shared LLMs in accuracy across multiple programming languages.
MoLE achieves greater parameter efficiency than training separate language-specific models.
MoLE effectively balances specialization and efficiency in multilingual programming tasks.
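The parameter-efficiency finding can be made concrete with a rough count. The sketch below is illustrative arithmetic only. It assumes the composition named in the abstract (one base model, one shared LoRA module, one LoRA module per language) and compares it with keeping a full copy of the base model per language. The base size, layer count, hidden width, rank, and number of languages are hypothetical placeholders, not figures from the paper.

```python
def lora_params(d_in, d_out, rank):
    # A LoRA module factors a (d_out x d_in) weight update as
    # B (d_out x rank) times A (rank x d_in), so it stores rank * (d_in + d_out) values.
    return rank * (d_in + d_out)


def separate_models_params(base_params, num_languages):
    # One fully finetuned copy of the base model per programming language.
    return base_params * num_languages


def mole_style_params(base_params, num_languages, num_adapted_layers, d_in, d_out, rank):
    # One base model, plus one shared LoRA and one LoRA per language,
    # each applied to the same set of adapted layers (an assumption of this sketch).
    per_module = num_adapted_layers * lora_params(d_in, d_out, rank)
    return base_params + (1 + num_languages) * per_module


# Hypothetical sizes, chosen only to make the arithmetic easy to follow.
BASE = 1_000_000_000
LANGS = 8
LAYERS = 32
D = 4096
RANK = 16

separate = separate_models_params(BASE, LANGS)
mole = mole_style_params(BASE, LANGS, LAYERS, D, D, RANK)
print(f"separate models: {separate:,} parameters")
print(f"base + shared + per-language LoRA: {mole:,} parameters")
```

Under these assumed sizes, the low-rank modules add only a few percent on top of a single base model, while separate models scale linearly with the number of languages.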
Abstract
Large language models (LLMs) have demonstrated impressive capabilities in aiding developers with tasks like code comprehension, generation, and translation. Supporting multilingual programming -- i.e., coding tasks across multiple programming languages -- typically requires either (1) finetuning a single LLM across all programming languages, which is cost-efficient but sacrifices language-specific specialization and performance, or (2) finetuning separate LLMs for each programming language, which allows for specialization but is computationally expensive and storage-intensive due to the duplication of parameters. This paper introduces MoLE (Mix-of-Language-Experts), a novel architecture that balances efficiency and specialization for multilingual programming. MoLE is composed of a base model, a shared LoRA (low-rank adaptation) module, and a collection of language-specific LoRA modules.…
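To make the three components concrete, here is a minimal sketch of a single linear layer with a frozen base weight, a shared LoRA module, and one LoRA module per programming language. The abstract as shown here is truncated before it explains how the modules are combined or how a language-specific module is selected, so two things in this sketch are assumptions rather than the authors' method: the low-rank updates are simply added to the base output, and the language is supplied as an explicit tag. The names `LoRA` and `MoLELinearSketch` are hypothetical.

```python
import random


def matvec(M, x):
    # Multiply matrix M (a list of rows) by vector x.
    return [sum(m * v for m, v in zip(row, x)) for row in M]


def random_matrix(rows, cols, rng, scale=0.1):
    return [[rng.uniform(-scale, scale) for _ in range(cols)] for _ in range(rows)]


def zeros(rows, cols):
    return [[0.0] * cols for _ in range(rows)]


class LoRA:
    """Low-rank update: delta(x) = B @ (A @ x)."""

    def __init__(self, d_in, d_out, rank, rng):
        self.A = random_matrix(rank, d_in, rng)  # down-projection
        self.B = zeros(d_out, rank)              # up-projection, zero-initialized

    def delta(self, x):
        return matvec(self.B, matvec(self.A, x))


class MoLELinearSketch:
    """Frozen base weight + shared LoRA + one LoRA per language (assumed additive)."""

    def __init__(self, d_in, d_out, rank, languages, seed=0):
        rng = random.Random(seed)
        self.W = random_matrix(d_out, d_in, rng)  # stands in for a frozen base weight
        self.shared = LoRA(d_in, d_out, rank, rng)
        self.experts = {lang: LoRA(d_in, d_out, rank, rng) for lang in languages}

    def forward(self, x, lang):
        base = matvec(self.W, x)
        shared = self.shared.delta(x)        # applied for every language
        expert = self.experts[lang].delta(x)  # applied only for the given language
        return [b + s + e for b, s, e in zip(base, shared, expert)]


layer = MoLELinearSketch(d_in=4, d_out=3, rank=2, languages=["python", "java"])
x = [1.0, 2.0, 3.0, 4.0]
y_python = layer.forward(x, "python")
y_java = layer.forward(x, "java")
```

Because each up-projection starts at zero, both calls initially return the base model's output. Once training updates the modules, the shared module would shift the output for every language, while each language-specific module would affect only inputs tagged with its language.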
Taxonomy
Topics: Distributed and Parallel Computing Systems · Service-Oriented Architecture and Web Services · Multi-Agent Systems and Negotiation
Methods: Balanced Selection
