A Sensitivity-Driven Expert Allocation Method in LoRA-MoE for Efficient Fine-Tuning
Junzhou Xu, Boyu Diao

TL;DR
This paper introduces LoRA-SMoE, a sensitivity-driven expert allocation method for efficient fine-tuning of large models, which improves performance and reduces parameters with minimal computational overhead.
Contribution
It proposes a novel sensitivity-based expert allocation technique within LoRA-MoE, enhancing fine-tuning efficiency and performance in resource-constrained environments.
Findings
Outperforms SOTA fine-tuning methods in accuracy.
Reduces the number of trainable parameters.
Maintains low memory consumption during training.
Abstract
As deep learning models expand, the pre-training-fine-tuning paradigm has become the standard approach for handling various downstream tasks. However, shared parameters can lead to diminished performance when dealing with complex datasets involving multiple tasks. While introducing Mixture-of-Experts (MoE) methods has alleviated this issue to some extent, it also significantly increases the number of parameters required for fine-tuning and training time, introducing greater parameter redundancy. To address these challenges, we propose a method for allocating expert numbers based on parameter sensitivity LoRA-SMoE (A Sensitivity-Driven Expert Allocation Method in LoRA-MoE for Efficient Fine-Tuning). This method rapidly assesses the sensitivity of different tasks to parameters by sampling a small amount of data and using gradient information. It then adaptively allocates expert numbers…
Peer Reviews
No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.
Videos
No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.
Taxonomy
TopicsMachine Learning and Data Classification · Domain Adaptation and Few-Shot Learning · Advanced Neural Network Applications
