AT-MoE: Adaptive Task-planning Mixture of Experts via LoRA Approach
Xurui Li, Juanjuan Yao

TL;DR
This paper presents AT-MoE, a novel architecture that uses LoRA-trained experts and adaptive routing to improve task-specific learning, interpretability, and performance in complex AI tasks, especially in sensitive fields like medicine.
Contribution
It introduces a new adaptive routing mechanism and LoRA-based expert training to enhance task-specific performance and interpretability of Mixture of Experts models.
Findings
Improved task-specific accuracy in complex tasks.
Enhanced interpretability of expert modules.
Effective adaptive routing for complex instructions.
Abstract
The advent of Large Language Models (LLMs) has ushered in a new era of artificial intelligence, with the potential to transform various sectors through automation and insightful analysis. The Mixture of Experts (MoE) architecture has been proposed as a solution to enhance model performance in complex tasks. Yet existing MoE models struggle with task-specific learning and interpretability, especially in fields like medicine where precision is critical. This paper introduces the Adaptive Task-planning Mixture of Experts (AT-MoE), an innovative architecture designed to address these limitations. We first train task-specific experts via the LoRA approach to enhance problem-solving capabilities and interpretability in specialized areas. Subsequently, we introduce a layer-wise adaptive grouped routing module that optimizes module fusion based on complex task instructions, ensuring optimal task…
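The abstract names two ingredients: LoRA-trained experts and a grouped routing module that fuses them. The sketch below illustrates one way such a fusion could look for a single layer. It is not the authors' implementation. The two-level softmax (first over groups, then over experts within each group), the function names, and the `scale` parameter are all assumptions made for illustration. Each expert is a standard LoRA pair (A, B) whose low-rank update B(Ax) is added to the frozen base projection Wx.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def matvec(mat, vec):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * v for w, v in zip(row, vec)) for row in mat]


def grouped_routing_weights(group_logits, expert_logits):
    """Assumed two-level routing: softmax over groups, then within each group.

    group_logits: one score per expert group.
    expert_logits: one list of scores per group (one score per expert).
    Returns per-group lists of fusion weights that sum to 1 overall.
    """
    group_w = softmax(group_logits)
    return [
        [g * w for w in softmax(logits)]
        for g, logits in zip(group_w, expert_logits)
    ]


def lora_moe_forward(x, base_w, experts, weights, scale=1.0):
    """Compute y = W x + scale * sum_i w_i * B_i (A_i x) for one layer.

    experts: flat list of (A, B) LoRA pairs, ordered group by group.
    weights: output of grouped_routing_weights, in the same order.
    """
    y = matvec(base_w, x)
    flat_w = [w for group in weights for w in group]
    for (a, b), w in zip(experts, flat_w):
        delta = matvec(b, matvec(a, x))  # low-rank update B (A x)
        y = [yi + scale * w * di for yi, di in zip(y, delta)]
    return y


if __name__ == "__main__":
    # Hypothetical toy setup: 2 groups with 1 expert each, rank-1 LoRA, 2-d input.
    base = [[1.0, 0.0], [0.0, 1.0]]
    experts = [
        ([[1.0, 0.0]], [[1.0], [0.0]]),
        ([[0.0, 1.0]], [[0.0], [1.0]]),
    ]
    w = grouped_routing_weights([2.0, 0.0], [[0.0], [0.0]])
    print(lora_moe_forward([1.0, 2.0], base, experts, w))
```

Because the fusion weights are explicit per group and per expert, they can be inspected directly to see which experts contributed to an output. In a layer-wise design, a separate set of weights would be computed for each layer.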
Taxonomy
Topics: Human-Automation Interaction and Safety · Big Data and Business Intelligence
Methods: Mixture of Experts · Weight Normalization
