FlexMoRE: A Flexible Mixture of Rank-heterogeneous Experts for Efficient Federatedly-trained Large Language Models
Annemette Brok Pirchert, Jacob Nielsen, Mogens Henrik From, Lukas Galke Poech, and Peter Schneider-Kamp

TL;DR
FlexMoRE introduces a flexible mixture of experts with variable ranks, optimizing performance and memory efficiency in federated large language models by tailoring expert size to task complexity.
Contribution
The paper proposes FlexMoRE, a novel mixture-of-experts architecture with rank-heterogeneous experts, demonstrating improved performance and efficiency over full-sized expert models.
Findings
Optimal expert rank varies with task type.
FlexMoRE outperforms full-sized experts in downstream tasks.
Significant memory savings with maintained or improved accuracy.
Abstract
Recent advances in mixture-of-experts architectures have shown that individual expert models can be trained federatedly, i.e., in isolation from other experts, by using a common base model to facilitate coordination. However, we hypothesize that full-sized experts may not be necessary for all domains and that low-rank adapters may be sufficient instead. Here, we introduce FlexMoRE, a Flexible Mixture of Rank-heterogeneous Experts, in which each expert may be either a full-sized expert or an adapter of a suitable rank. We systematically investigate the trade-off between expert rank and downstream task performance by evaluating experts over a range of ranks, resulting in experiments covering 150 mixtures (96 with 2 experts, 54 with 7 experts) that are evaluated across tasks. For our experiments, we build on FlexOlmo and turn its pre-trained experts into low-rank versions. Our regression…
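To make the rank versus memory trade-off in the abstract concrete, here is a minimal sketch of the parameter arithmetic for a rank-heterogeneous mixture. It is an illustration under stated assumptions, not the authors' implementation: each expert is reduced to a single weight matrix, a low-rank expert is assumed to be stored as two factors of shapes (d_out, rank) and (rank, d_in), and the function names, dimensions, and ranks are all hypothetical.

```python
from typing import List, Optional


def full_params(d_in: int, d_out: int) -> int:
    """Parameter count of one dense weight matrix of shape (d_out, d_in)."""
    return d_in * d_out


def low_rank_params(d_in: int, d_out: int, rank: int) -> int:
    """Parameter count of a rank-`rank` factorization B @ A,
    with B of shape (d_out, rank) and A of shape (rank, d_in)."""
    return rank * (d_in + d_out)


def mixture_params(d_in: int, d_out: int, ranks: List[Optional[int]]) -> int:
    """Total parameters of a mixture of experts.

    Each entry of `ranks` describes one expert: None marks a full-sized
    expert, an integer marks a low-rank expert of that rank.
    """
    total = 0
    for rank in ranks:
        if rank is None:
            total += full_params(d_in, d_out)
        else:
            total += low_rank_params(d_in, d_out, rank)
    return total


if __name__ == "__main__":
    # Hypothetical layer dimensions and ranks, chosen only for illustration.
    d_in, d_out = 4096, 11008
    heterogeneous = [None, 64, 256, 1024]  # one full expert, three adapters
    all_full = [None] * len(heterogeneous)

    mixed = mixture_params(d_in, d_out, heterogeneous)
    dense = mixture_params(d_in, d_out, all_full)
    print(f"heterogeneous / all-full parameter ratio: {mixed / dense:.3f}")
```

Under this simplified model, a low-rank expert is smaller than a full-sized one whenever its rank is below d_in * d_out / (d_in + d_out), which is why the per-expert rank controls how much memory a mixture saves.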
Taxonomy
Topics: Mobile Crowdsensing and Crowdsourcing · Domain Adaptation and Few-Shot Learning · Expert finding and Q&A systems
