AT-MoE: Adaptive Task-planning Mixture of Experts via LoRA Approach

Xurui Li; Juanjuan Yao

arXiv:2410.10896 · cs.LG · October 22, 2024

TL;DR

This paper presents AT-MoE, a novel architecture that uses LoRA-trained experts and adaptive routing to improve task-specific learning, interpretability, and performance in complex AI tasks, especially in sensitive fields like medicine.

Contribution

It introduces a new adaptive routing mechanism and LoRA-based expert training to enhance task-specific performance and interpretability of Mixture of Experts models.

Findings

1. Improved task-specific accuracy in complex tasks.
2. Enhanced interpretability of expert modules.
3. Effective adaptive routing for complex instructions.

Abstract

The advent of Large Language Models (LLMs) has ushered in a new era of artificial intelligence, with the potential to transform various sectors through automation and insightful analysis. The Mixture of Experts (MoE) architecture has been proposed as a solution to enhance model performance in complex tasks. Yet existing MoE models struggle with task-specific learning and interpretability, especially in fields like medicine where precision is critical. This paper introduces the Adaptive Task-planning Mixture of Experts (AT-MoE), an innovative architecture designed to address these limitations. We first train task-specific experts via the LoRA approach to enhance problem-solving capabilities and interpretability in specialized areas. Subsequently, we introduce a layer-wise adaptive grouped routing module that optimizes module fusion based on complex task instructions, ensuring optimal task…
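The two ideas named in the abstract, LoRA-trained experts and grouped routing over them, can be illustrated with a small NumPy sketch. This is not the authors' implementation: the abstract does not specify the routing function, so the two-level softmax (first over groups, then over experts within each group), the linear gates, and every name below (`LoRAExpert`, `grouped_routing_weights`, `layer_forward`) are assumptions made for illustration only.

```python
import numpy as np


def softmax(z):
    """Numerically stable softmax over a 1-D array."""
    z = z - np.max(z)
    e = np.exp(z)
    return e / e.sum()


class LoRAExpert:
    """Hypothetical low-rank expert: contributes B @ A @ x on top of a frozen layer."""

    def __init__(self, d_in, d_out, rank, rng):
        # Common LoRA initialisation: A is small random, B is zero,
        # so an untrained expert leaves the base layer unchanged.
        self.A = rng.normal(0.0, 0.01, size=(rank, d_in))
        self.B = np.zeros((d_out, rank))

    def delta(self, x):
        return self.B @ (self.A @ x)


def grouped_routing_weights(x, group_gate, expert_gates):
    """Assumed two-level routing: softmax over groups, then within each group.

    group_gate:   array of shape (n_groups, d_in)
    expert_gates: list of arrays, one per group, each (n_experts_in_group, d_in)
    Returns one weight per expert; the weights sum to 1.
    """
    group_w = softmax(group_gate @ x)
    weights = []
    for g, gate in enumerate(expert_gates):
        within = softmax(gate @ x)
        weights.append(group_w[g] * within)
    return np.concatenate(weights)


def layer_forward(x, W, experts, group_gate, expert_gates):
    """Frozen base projection plus the routed mixture of LoRA updates."""
    w = grouped_routing_weights(x, group_gate, expert_gates)
    out = W @ x
    for wi, expert in zip(w, experts):
        out = out + wi * expert.delta(x)
    return out, w


# Usage: two groups of two experts each on one layer (sizes are arbitrary).
rng = np.random.default_rng(0)
d_in, d_out, rank = 8, 6, 2
experts = [LoRAExpert(d_in, d_out, rank, rng) for _ in range(4)]
W = rng.normal(size=(d_out, d_in))
group_gate = rng.normal(size=(2, d_in))
expert_gates = [rng.normal(size=(2, d_in)) for _ in range(2)]
x = rng.normal(size=d_in)
out, w = layer_forward(x, W, experts, group_gate, expert_gates)
```

In this sketch each layer would hold its own gates, which is one plausible reading of "layer-wise" routing. The per-expert weights `w` can be inspected directly, which is the kind of transparency the interpretability claim points at.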

Taxonomy

Topics: Human-Automation Interaction and Safety · Big Data and Business Intelligence

Methods: Mixture of Experts · Weight Normalization