MoA: Heterogeneous Mixture of Adapters for Parameter-Efficient Fine-Tuning of Large Language Models

Jie Cao; Tianwei Lin; Bo Yuan; Rolan Yan; Hongyang He; Wenqiao Zhang; Juncheng Li; Dongping Zhang; Siliang Tang; Yueting Zhuang

arXiv:2506.05928 · cs.CL · January 21, 2026

TL;DR

The paper introduces a heterogeneous Mixture-of-Adapters (MoA) approach for parameter-efficient fine-tuning of large language models, addressing limitations of homogeneous methods by leveraging diverse adapter structures for improved performance and efficiency.

Contribution

It proposes a novel heterogeneous MoA method with soft and sparse variants, enhancing expert specialization and transfer learning in LLM fine-tuning.

Findings

01. Heterogeneous MoA outperforms homogeneous MoE-LoRA in accuracy.

02. MoA achieves better parameter efficiency.

03. Sparse MoA maintains performance with sparse expert activation.

Abstract

Recent studies integrate Low-Rank Adaptation (LoRA) and Mixture-of-Experts (MoE) to further enhance the performance of parameter-efficient fine-tuning (PEFT) methods in Large Language Model (LLM) applications. Existing methods employ homogeneous MoE-LoRA architectures composed of LoRA experts with either similar or identical structures and capacities. However, these approaches often suffer from representation collapse and expert load imbalance, which negatively impact the potential of LLMs. To address these challenges, we propose a heterogeneous Mixture-of-Adapters (MoA) approach. This method dynamically integrates PEFT adapter experts with diverse structures, leveraging their complementary representational capabilities to foster expert specialization, thereby enhancing the effective transfer of pre-trained knowledge to downstream tasks. MoA supports two variants:…
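
To make the idea of routing over structurally different adapter experts concrete, here is a minimal sketch in plain Python. It is not the authors' implementation: the choice of experts (two LoRA-style experts of different rank plus one bottleneck adapter), the linear softmax router, and the top-k rule used for the sparse case are all assumptions for illustration, since the abstract above is truncated before it describes the two variants. All class and function names are hypothetical.

```python
import math
import random


def matvec(W, x):
    """Multiply matrix W (a list of rows) by vector x."""
    return [sum(w * v for w, v in zip(row, x)) for row in W]


def softmax(z):
    """Numerically stable softmax over a list of scores."""
    m = max(z)
    e = [math.exp(v - m) for v in z]
    s = sum(e)
    return [v / s for v in e]


def rand_matrix(rows, cols, rng, scale=0.1):
    return [[rng.uniform(-scale, scale) for _ in range(cols)] for _ in range(rows)]


class LoRAExpert:
    """Low-rank update x -> B(Ax). B is random here only so the demo output
    is non-zero; LoRA normally initialises B to zero."""

    def __init__(self, d, rank, rng):
        self.A = rand_matrix(rank, d, rng)
        self.B = rand_matrix(d, rank, rng)

    def __call__(self, x):
        return matvec(self.B, matvec(self.A, x))


class BottleneckExpert:
    """Bottleneck adapter: down-projection, ReLU, up-projection."""

    def __init__(self, d, hidden, rng):
        self.down = rand_matrix(hidden, d, rng)
        self.up = rand_matrix(d, hidden, rng)

    def __call__(self, x):
        h = [max(0.0, v) for v in matvec(self.down, x)]
        return matvec(self.up, h)


class MixtureOfAdapters:
    """Hypothetical router over heterogeneous experts.

    top_k=None -> soft mixing: every expert contributes, weighted by its gate.
    top_k=k    -> sparse mixing (assumed rule): keep the k largest gates.
    """

    def __init__(self, experts, d, rng, top_k=None):
        self.experts = experts
        self.router = rand_matrix(len(experts), d, rng)
        self.top_k = top_k

    def gates(self, x):
        g = softmax(matvec(self.router, x))
        if self.top_k is None:
            return g
        keep = sorted(range(len(g)), key=lambda i: g[i], reverse=True)[: self.top_k]
        total = sum(g[i] for i in keep)
        # Renormalise the surviving gates so they still sum to 1.
        return [g[i] / total if i in keep else 0.0 for i in range(len(g))]

    def __call__(self, x):
        out = list(x)  # x stands in for the frozen layer's output
        for g, expert in zip(self.gates(x), self.experts):
            if g == 0.0:
                continue  # skipped experts cost no compute
            delta = expert(x)
            out = [o + g * dv for o, dv in zip(out, delta)]
        return out


rng = random.Random(0)
d = 8
experts = [LoRAExpert(d, 2, rng), LoRAExpert(d, 4, rng), BottleneckExpert(d, 3, rng)]
x = [rng.uniform(-1, 1) for _ in range(d)]

soft = MixtureOfAdapters(experts, d, rng)
sparse = MixtureOfAdapters(experts, d, rng, top_k=1)
soft_out = soft(x)
sparse_out = sparse(x)
```

In the soft configuration every expert runs and the gates only reweight their outputs; in the sparse configuration the experts with zero gate are never evaluated, which is where any compute saving would come from under this assumed routing rule.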

Code & Models

Models

🤗 cajie/Soft_MoA (model)

Taxonomy

Topics: Topic Modeling · Domain Adaptation and Few-Shot Learning · Artificial Intelligence in Healthcare and Education