Assembly of Experts: Linear-time construction of the Chimera LLM variants with emergent and adaptable behaviors

Henrik Klagges; Robert Dahlke; Fabian Klemm; Benjamin Merkel; Daniel Klingmann; David A. Reiss; Dan Zecha

arXiv:2506.14794·cs.LG·June 19, 2025



Open Access · 10 Models

TL;DR

This paper introduces a linear-time method to create hybrid language models by interpolating weights from parent models, resulting in functional, adaptable, and efficient Chimera variants with emergent behaviors.

Contribution

The paper presents the Assembly-of-Experts method for rapid construction of hybrid LLMs, enabling new models with emergent traits without fine-tuning or distillation.

Findings

01. Nearly all generated models are functional and capable.

02. The Chimera model achieves about R1-level intelligence while using about 40% fewer output tokens.

03. As the proportion of weights taken from each parent varies, some properties change gradually while other behavioral traits emerge with a sharp transition.

Abstract

Requiring $10^{13}$ to $10^{15}$ FLOPs to calculate one 8-bit weight in an LLM during pretraining is extremely expensive and seems inefficient. To better leverage the huge investments made into pretrained models, we develop the new "Assembly-of-Experts" (AoE) construction method to create capable child variants of existing Mixture-of-Experts parent models in linear time. Model weight tensors are interpolated individually, which allows semantic features of the parents to be enhanced or suppressed. Varying the proportion of weights taken from the parent models, we observe some properties of the AoE child model changing gradually, while other behavioral traits emerge with a sharp transition. Surprisingly, nearly every generated model is functional and capable, which makes searching the model space straightforward. We construct the DeepSeek R1T "Chimera", a 671B open-weights hybrid model combining…
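The per-tensor interpolation described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the tensor names, and the selection policy `experts_from_b` are hypothetical, and weights are modeled as flat Python lists rather than real tensors. It only shows why the construction is linear in the number of weights: each tensor is visited once and mixed with its own coefficient.

```python
from typing import Callable, Dict, List

Tensor = List[float]  # flattened weights; a stand-in for a real tensor type


def assemble_child(
    parent_a: Dict[str, Tensor],
    parent_b: Dict[str, Tensor],
    weight_of_b: Callable[[str], float],
) -> Dict[str, Tensor]:
    """Interpolate each named tensor separately: (1 - lam) * a + lam * b."""
    if parent_a.keys() != parent_b.keys():
        raise ValueError("parents must share the same tensor names")
    child: Dict[str, Tensor] = {}
    for name, a in parent_a.items():  # single pass over all tensors
        b = parent_b[name]
        if len(a) != len(b):
            raise ValueError(f"shape mismatch for tensor {name!r}")
        lam = weight_of_b(name)  # per-tensor share taken from parent B
        child[name] = [(1.0 - lam) * x + lam * y for x, y in zip(a, b)]
    return child


def experts_from_b(name: str) -> float:
    # Hypothetical policy (an assumption, not from the paper excerpt):
    # take expert tensors fully from parent B, blend everything else 75/25.
    return 1.0 if ".experts." in name else 0.25


# Toy parents with made-up tensor names and values.
parent_a = {"layer0.attn.w": [0.0, 4.0], "layer0.experts.0.w": [1.0, 1.0]}
parent_b = {"layer0.attn.w": [4.0, 0.0], "layer0.experts.0.w": [3.0, 5.0]}
child = assemble_child(parent_a, parent_b, experts_from_b)
print(child)
```

Swapping in a different `weight_of_b` policy is how one would explore the space of child models in this sketch; sweeping its values between 0 and 1 corresponds to varying the proportion of weights taken from each parent.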


Taxonomy

Topics: Modular Robots and Swarm Intelligence · Multi-Agent Systems and Negotiation · Scheduling and Optimization Algorithms