Dynamic Mixture of Curriculum LoRA Experts for Continual Multimodal Instruction Tuning
Chendi Ge, Xin Wang, Zeyang Zhang, Hong Chen, Jiapei Fan, Longtao Huang, Hui Xue, Wenwu Zhu

TL;DR
This paper introduces D-MoLE, a novel architecture-evolving method for continual multimodal instruction tuning that dynamically allocates model capacity to improve adaptation to new tasks while preserving prior knowledge.
Contribution
It proposes a dynamic, architecture-evolving approach with expert allocation and curriculum learning to address task conflicts and modality imbalance in continual MLLM training.
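To make the idea of expert allocation under a parameter budget concrete, the following is a minimal sketch and not the authors' algorithm. The per-layer `score`, the greedy selection rule, and the names `Layer`, `lora_params`, and `allocate_experts` are all hypothetical, introduced only for illustration. The one standard fact it relies on is that a rank-r LoRA adapter on a d_in x d_out weight adds r * (d_in + d_out) parameters.

```python
from dataclasses import dataclass


@dataclass
class Layer:
    name: str
    in_dim: int
    out_dim: int
    score: float  # hypothetical "needs adaptation for the new task" score (assumption)


def lora_params(layer: Layer, rank: int) -> int:
    """Parameters added by one LoRA adapter: A is (rank x in_dim), B is (out_dim x rank)."""
    return rank * (layer.in_dim + layer.out_dim)


def allocate_experts(layers: list[Layer], rank: int, budget: int) -> list[str]:
    """Greedy illustration: visit layers from highest to lowest score and
    give a new LoRA expert to each one whose cost still fits the budget."""
    chosen = []
    remaining = budget
    for layer in sorted(layers, key=lambda l: l.score, reverse=True):
        cost = lora_params(layer, rank)
        if cost <= remaining:
            chosen.append(layer.name)
            remaining -= cost
    return chosen


if __name__ == "__main__":
    # Toy layer list; names, sizes, and scores are made up for the example.
    layers = [
        Layer("vision.block0", 1024, 1024, 0.2),
        Layer("vision.block1", 1024, 1024, 0.9),
        Layer("llm.block0", 4096, 4096, 0.7),
        Layer("llm.block1", 4096, 4096, 0.4),
    ]
    print(allocate_experts(layers, rank=8, budget=100_000))
```

In this toy setup, different tasks would produce different scores and therefore different sets of adapted layers, which is one way to picture the layer-wise variation between tasks that the paper describes. How D-MoLE actually scores layers and balances modalities is not specified in the text here.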
Findings
Reports a 15% average improvement over baselines.
Presented as the first study of continual learning for MLLMs from an architectural perspective.
Effectively balances modality updates and task adaptation.
Abstract
Continual multimodal instruction tuning is crucial for adapting Multimodal Large Language Models (MLLMs) to evolving tasks. However, most existing methods adopt a fixed architecture, struggling with adapting to new tasks due to static model capacity. We propose to evolve the architecture under parameter budgets for dynamic task adaptation, which remains unexplored and imposes two challenges: 1) task architecture conflict, where different tasks require varying layer-wise adaptations, and 2) modality imbalance, where different tasks rely unevenly on modalities, leading to unbalanced updates. To address these challenges, we propose a novel Dynamic Mixture of Curriculum LoRA Experts (D-MoLE) method, which automatically evolves MLLM's architecture with controlled parameter budgets to continually adapt to new tasks while retaining previously learned knowledge. Specifically, we propose a…
Taxonomy
Topics: Innovative Teaching and Learning Methods · Educational Technology and Assessment · Educational Tools and Methods
Methods: ADaptive gradient method with the OPTimal convergence rate
