Flexible and Effective Mixing of Large Language Models into a Mixture of Domain Experts
Rhui Dih Lee, Laura Wynter, Raghu Kiran Ganti

TL;DR
This paper introduces a toolkit for efficiently creating and configuring mixture-of-domain-experts models from trained models or adapters, facilitating low-cost domain specialization in large language models.
Contribution
The paper provides a practical toolkit and guidance for building MOE models, enabling flexible and cost-effective domain adaptation of large language models.
Findings
Toolkit supports creation of MOE from models or adapters
Extensive testing and guidance provided for architecture design
Public repository available for community use
Abstract
We present a toolkit for creating low-cost Mixture-of-Domain-Experts (MOE) from trained models. The toolkit can be used for creating a mixture from models or from adapters. We perform extensive tests and offer guidance on defining the architecture of the resulting MOE using the toolkit. A public repository is available.
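The abstract does not describe the toolkit's internals, so the following is only a generic sketch of how a mixture of domain experts can combine expert outputs. It assumes a softmax gate over the top-k scoring experts, which is a common MoE design and not necessarily what the toolkit does. All names here (`make_linear_expert`, `moe_forward`, `gate_weights`) are hypothetical and are not part of the toolkit's API; the "experts" are toy scaling functions standing in for domain-tuned feed-forward layers.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def make_linear_expert(scale):
    """Hypothetical stand-in for one domain expert's feed-forward layer."""
    def expert(x):
        return [scale * v for v in x]
    return expert


def moe_forward(x, experts, gate_weights, top_k=2):
    """Route input x through the top_k experts and mix their outputs.

    gate_weights holds one row per expert; an expert's gate score is the
    dot product of its row with x (an assumed, simplified linear router).
    """
    scores = [sum(w * v for w, v in zip(row, x)) for row in gate_weights]
    # Indices of the top_k highest-scoring experts (ties keep list order).
    top = sorted(range(len(experts)), key=lambda i: scores[i], reverse=True)[:top_k]
    # Renormalize the selected scores so the mixing weights sum to 1.
    probs = softmax([scores[i] for i in top])
    out = [0.0] * len(x)
    for p, i in zip(probs, top):
        y = experts[i](x)
        out = [o + p * v for o, v in zip(out, y)]
    return out, top


# Three toy "domain experts" and a gate that scores every expert equally.
experts = [make_linear_expert(s) for s in (1.0, 2.0, 3.0)]
uniform_gate = [[0.0, 0.0] for _ in experts]
mixed, chosen = moe_forward([1.0, 2.0], experts, uniform_gate, top_k=2)
```

With equal gate scores, the first two experts are selected and their outputs are averaged. In a real mixture the gate weights would either be trained or replaced by some fixed weighting; which option fits best is the kind of architecture choice the paper's guidance addresses.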
Code & Models
- 🤗 davzoku/moecule-3x1b-m6-fks · model · 2 dl
- 🤗 davzoku/moecule-2x1b-m7-fk · model · 1 dl
- 🤗 davzoku/moecule-2x1b-m8-fs · model · 3 dl
- 🤗 davzoku/moecule-2x1b-m9-ks · model · 2 dl
- 🤗 davzoku/moecule-3x3b-m10-fks · model · 4 dl · ♡ 1
- 🤗 davzoku/moecule-2x3b-m11-fk · model · 29 dl
- 🤗 davzoku/moecule-2x3b-m12-fs · model · 1 dl
- 🤗 davzoku/moecule-2x3b-m13-ks · model · 1 dl
Taxonomy
Topics: Topic Modeling · Expert finding and Q&A systems
Methods: Mixture of Experts
