MINGLE: Mixture of Null-Space Gated Low-Rank Experts for Test-Time Continual Model Merging
Zihuan Qiu, Yi Xu, Chiyuan He, Fanman Meng, Linfeng Xu, Qingbo Wu, Hongliang Li

TL;DR
MINGLE is a test-time continual model merging framework that uses a mixture of low-rank experts with null-space gating and adaptive constraints to reduce forgetting and improve adaptability when models are merged sequentially.
Contribution
The paper proposes MINGLE, a framework that combines low-rank experts, null-space gating, and adaptive strategies for test-time continual model merging.
Findings
Outperforms previous methods by 7-9% on average across benchmarks.
Effectively reduces catastrophic forgetting during sequential model merging.
Enhances adaptability to evolving test distributions.
Abstract
Continual model merging integrates independently fine-tuned models sequentially without access to the original training data, offering a scalable and efficient solution for continual learning. However, existing methods face two critical challenges: parameter interference among tasks, which leads to catastrophic forgetting, and limited adaptability to evolving test distributions. To address these issues, we introduce the task of Test-Time Continual Model Merging (TTCMM), which leverages a small set of unlabeled test samples during inference to alleviate parameter conflicts and handle distribution shifts. We propose MINGLE, a novel framework for TTCMM. MINGLE employs a mixture-of-experts architecture with parameter-efficient, low-rank experts, which enhances adaptability to evolving test distributions while dynamically merging models to mitigate conflicts. To further reduce forgetting, we…
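The abstract is truncated here, so the exact gating rule is not visible in this summary. As a rough illustration only, the sketch below shows one way a mixture of low-rank experts with a null-space constrained gate could behave: each merged task adds a low-rank update `B @ A` to a frozen weight `w0`, and the gate vector is projected away from the subspace spanned by earlier-task features so that earlier-task inputs do not activate the new expert. The names `null_space_projector` and `merged_forward`, the ReLU gate, and the fixed `rank` argument are all assumptions made for this example, not details taken from the paper.

```python
import numpy as np


def null_space_projector(old_feats, rank):
    """Projector onto the orthogonal complement of the top-`rank`
    right-singular subspace of old-task features (rows are samples)."""
    _, _, vt = np.linalg.svd(old_feats, full_matrices=False)
    basis = vt[:rank]                                  # (rank, d)
    return np.eye(old_feats.shape[1]) - basis.T @ basis


def merged_forward(x, w0, experts, gate_vectors):
    """y = x W0^T + sum_i gate_i(x) * x A_i^T B_i^T.

    Hypothetical gate: ReLU of a linear score, so a zero score means
    the expert contributes nothing for that input.
    """
    y = x @ w0.T
    for (a, b), g_vec in zip(experts, gate_vectors):
        gate = np.maximum(x @ g_vec, 0.0)              # (n,)
        y = y + gate[:, None] * (x @ a.T @ b.T)        # low-rank update
    return y


rng = np.random.default_rng(0)
d, out_dim, r = 6, 4, 2

w0 = rng.normal(size=(out_dim, d))                     # frozen base weight
# Toy old-task features that live in the span of the first two axes.
old_feats = rng.normal(size=(20, 2)) @ np.eye(d)[:2]

# One low-rank expert (A: r x d, B: out_dim x r) for a newly merged task.
experts = [(rng.normal(size=(r, d)), rng.normal(size=(out_dim, r)))]

# Constrain the gate to the null space of the old-task feature subspace.
P = null_space_projector(old_feats, rank=2)
gate_vec = P @ np.ones(d)

x_new = np.eye(d)[2:3]                                 # input outside the old subspace
y_old = merged_forward(old_feats, w0, experts, [gate_vec])
y_new = merged_forward(x_new, w0, experts, [gate_vec])
```

In this toy setup the old-task inputs produce a gate score of (numerically) zero, so `y_old` matches the base model output, while `x_new` activates the expert. Whether MINGLE constrains the gate in this way, and how it picks the protected subspace, would need to be checked against the full paper.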
Taxonomy
Topics: Model Reduction and Neural Networks · Gaussian Processes and Bayesian Inference · Seismology and Earthquake Studies
Methods: Sparse Evolutionary Training
