L-MoE: End-to-End Training of a Lightweight Mixture of Low-Rank Adaptation Experts
Shihao Ji, Zihui Song

TL;DR
L-MoE introduces an end-to-end trainable framework combining Mixture of Experts and Low-Rank Adaptation, enabling efficient, modular, and specialized language models through dynamic expert composition.
Contribution
The paper proposes L-MoE, a novel approach that unifies MoE and LoRA into a differentiable, end-to-end trainable model with task-specific low-rank experts and dynamic routing.
Findings
Achieves parameter-efficient language modeling with dynamic expert composition.
Allows end-to-end training of a modular MoE architecture.
Demonstrates improved scalability and specialization in language models.
Abstract
The Mixture of Experts (MoE) architecture enables the scaling of Large Language Models (LLMs) to trillions of parameters by activating a sparse subset of weights for each input, maintaining constant computational cost during inference. Concurrently, Low-Rank Adaptation (LoRA) has emerged as a dominant technique for parameter-efficient fine-tuning of LLMs on specialized tasks. In this work, we unify these two paradigms into a novel, end-to-end trainable framework named L-MoE: a Lightweight Mixture of LoRA Experts. L-MoE redefines MoE experts not as dense feed-forward networks, but as a collection of task-specialized, low-rank adapters. A lightweight gating network, trained jointly with the experts, learns to dynamically compose these LoRA adapters by computing a weighted average of their parameters for each input token. This composition is fully differentiable, allowing gradients from a…
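The per-token composition described in the abstract can be illustrated with a small NumPy sketch. This is not the authors' implementation: the function name `lmoe_layer`, the tensor shapes, the softmax gate, and the choice to average the low-rank updates `B_e @ A_e` (rather than averaging `A` and `B` separately) are all assumptions made for illustration, since the excerpt above does not specify them.

```python
import numpy as np

def softmax(z):
    # Numerically stable softmax over the last axis.
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def lmoe_layer(x, W0, A, B, Wg):
    """Hypothetical L-MoE-style linear layer (illustrative only).

    x  : (T, d_in)       token activations
    W0 : (d_out, d_in)   frozen base weight
    A  : (E, r, d_in)    LoRA down-projections, one per expert
    B  : (E, d_out, r)   LoRA up-projections, one per expert
    Wg : (d_in, E)       gating weights (assumed linear gate + softmax)
    """
    gates = softmax(x @ Wg)                      # (T, E), each row sums to 1
    base = x @ W0.T                              # (T, d_out)
    down = np.einsum('erd,td->ter', A, x)        # (T, E, r): A_e x_t
    up = np.einsum('eor,ter->teo', B, down)      # (T, E, d_out): B_e A_e x_t
    delta = np.einsum('te,teo->to', gates, up)   # gate-weighted sum over experts
    return base + delta, gates

# Toy usage with random parameters (sizes are arbitrary).
rng = np.random.default_rng(0)
T, d_in, d_out, r, E = 4, 8, 6, 2, 3
x = rng.normal(size=(T, d_in))
W0 = rng.normal(size=(d_out, d_in))
A = rng.normal(size=(E, r, d_in))
B = rng.normal(size=(E, d_out, r))
Wg = rng.normal(size=(d_in, E))
y, gates = lmoe_layer(x, W0, A, B, Wg)
print(y.shape, gates.sum(axis=-1))
```

Because the gate weights enter the output through an ordinary weighted sum, every operation here is differentiable, which is the property that lets the gate and the experts be trained jointly in an autodiff framework. Under the stated assumption, the result for each token equals applying the single composed weight `W0 + sum_e g_e * B_e @ A_e` to that token.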
Taxonomy
Topics: Topic Modeling · Domain Adaptation and Few-Shot Learning · Artificial Intelligence in Healthcare and Education
