TL;DR
Brainstacks introduces a modular, cross-domain continual learning architecture for LLMs that leverages frozen adapter stacks, enabling efficient multi-domain adaptation and transfer of cognitive skills.
Contribution
The paper presents Brainstacks, a novel modular framework with frozen adapter stacks, null-space constraints, and outcome-based routing for continual multi-domain LLM fine-tuning.
Findings
MoE-LoRA achieves 2.5x faster convergence than single LoRA.
Residual boosting surpasses single-stack performance limits.
Outcome-based router identifies transferable cognitive primitives.
Abstract
We present Brainstacks, a modular architecture for continual multi-domain fine-tuning of large language models that packages domain expertise as frozen adapter stacks composing additively on a shared frozen base at inference. It combines five interlocking components: (1) MoE-LoRA with Shazeer-style noisy top-2 routing across all seven transformer projections under QLoRA 4-bit quantization with rsLoRA scaling; (2) an inner loop performing residual boosting by freezing trained stacks and adding new ones; (3) an outer loop training sequential domain-specific stacks with curriculum-ordered dependencies; (4) null-space projection via randomized SVD constraining new stacks to subspaces orthogonal to prior directions, achieving zero forgetting in isolation; (5) an outcome-based sigmoid meta-router trained on empirically discovered domain-combination targets that selectively weights stacks, enabling…
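To make component (1) concrete, here is a minimal sketch of Shazeer-style noisy top-k gating for a single token, using only the Python standard library. It is an illustration under stated assumptions, not the Brainstacks implementation: the function name `noisy_top_k_gates`, its arguments, and the example logits are all hypothetical, and in a real model the clean logits and raw noise scales would come from learned projections of the token's hidden state.

```python
import math
import random


def softplus(x: float) -> float:
    # Numerically stable log(1 + exp(x)); keeps the noise scale positive.
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def noisy_top_k_gates(clean_logits, noise_raw, k=2, rng=None, training=True):
    """Return one gate weight per expert; only the top-k entries are nonzero.

    clean_logits and noise_raw are per-expert scalars (hypothetical inputs
    standing in for the outputs of learned gating and noise projections).
    """
    rng = rng or random.Random()
    if training:
        # Add Gaussian noise with a per-expert, softplus-scaled std deviation.
        noisy = [c + rng.gauss(0.0, 1.0) * softplus(n)
                 for c, n in zip(clean_logits, noise_raw)]
    else:
        # Assumption: noise is disabled at evaluation time.
        noisy = list(clean_logits)

    # Keep the k largest logits and apply a softmax over just those.
    top = sorted(range(len(noisy)), key=lambda i: noisy[i], reverse=True)[:k]
    peak = max(noisy[i] for i in top)
    exps = {i: math.exp(noisy[i] - peak) for i in top}
    total = sum(exps.values())

    gates = [0.0] * len(noisy)
    for i in top:
        gates[i] = exps[i] / total
    return gates


# Four hypothetical experts; a seeded generator makes the noise reproducible.
train_gates = noisy_top_k_gates([2.0, 0.5, 1.0, -1.0], [0.0, 0.0, 0.0, 0.0],
                                rng=random.Random(0))
eval_gates = noisy_top_k_gates([2.0, 0.5, 1.0, -1.0], [0.0, 0.0, 0.0, 0.0],
                               training=False)
```

Each token's output is then the gate-weighted sum of the two selected experts' outputs, so the other experts need not be evaluated for that token. The noise term perturbs which experts win during training, which is the usual motivation for it as a load-spreading device.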
