Merging Models on the Fly Without Retraining: A Sequential Approach to Scalable Continual Model Merging
Anke Tang, Enneng Yang, Li Shen, Yong Luo, Han Hu, Bo Du, Dacheng Tao

TL;DR
This paper introduces a training-free, projection-based method for sequentially merging deep models without retraining, reducing memory use and interference, and improving accuracy on CLIP-ViT models.
Contribution
It presents a novel continual merging technique that operates without retraining, using orthogonal projections and adaptive scaling to efficiently combine models sequentially.
Findings
Achieves 5-8% average accuracy improvement on CLIP-ViT models.
Maintains constant memory complexity regardless of the number of models.
Reduces task interference through orthogonal projections.
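To make the findings above concrete, here is a minimal NumPy sketch of one sequential merge step for a single weight matrix. It is an illustration under stated assumptions, not the authors' implementation: the projection removes from each new update the part lying in the top-`rank` column space of the current merged update (obtained by SVD), and the "adaptive scaling" is assumed to rescale the merged update to the running mean Frobenius norm of the updates seen so far. The names `project_out`, `merge_step`, and `rank` are hypothetical.

```python
import numpy as np

def project_out(delta_new, delta_merged, rank):
    """Remove from delta_new its component in the top-`rank` column space of delta_merged."""
    u, _, _ = np.linalg.svd(delta_merged, full_matrices=False)
    basis = u[:, :rank]                       # orthonormal columns
    return delta_new - basis @ (basis.T @ delta_new)

def merge_step(delta_merged, mean_norm, count, delta_new, rank=2):
    """Fold one new update into the running merged update.

    Only (delta_merged, mean_norm, count) is carried between steps, so the
    state size does not grow with the number of models merged.
    """
    new_norm = np.linalg.norm(delta_new)      # Frobenius norm
    if count == 0:
        return delta_new.copy(), new_norm, 1
    combined = delta_merged + project_out(delta_new, delta_merged, rank)
    mean_norm = (mean_norm * count + new_norm) / (count + 1)
    # Assumed scaling rule: keep the merged update at the average update norm.
    scale = mean_norm / (np.linalg.norm(combined) + 1e-12)
    return combined * scale, mean_norm, count + 1

# Usage: models arrive one at a time; earlier models are never revisited.
rng = np.random.default_rng(0)
pretrained = rng.normal(size=(8, 6))          # stand-in for one pretrained weight matrix
state = (np.zeros((8, 6)), 0.0, 0)
for _ in range(4):
    finetuned = pretrained + 0.1 * rng.normal(size=(8, 6))
    state = merge_step(*state, finetuned - pretrained)
merged_weights = pretrained + state[0]
```

The design point the sketch tries to show is that each step needs only the current merged update and two scalars, which is what keeps memory constant as more models arrive. The paper's actual projection and scaling rules may differ from the ones assumed here.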
Abstract
Deep model merging represents an emerging research direction that combines multiple fine-tuned models to harness their specialized capabilities across different tasks and domains. Current model merging techniques focus on merging all available models simultaneously, with weight interpolation-based methods being the predominant approaches. However, these conventional approaches are not well-suited for scenarios where models become available sequentially, and they often suffer from high memory requirements and potential interference between tasks. In this study, we propose a training-free projection-based continual merging method that processes models sequentially through orthogonal projections of weight matrices and adaptive scaling mechanisms. Our method operates by projecting new parameter updates onto subspaces orthogonal to existing merged parameter updates while using an adaptive…
Taxonomy
Topics: Multi-Agent Systems and Negotiation · Simulation Techniques and Applications
Methods: Focus
