How to Merge Your Multimodal Models Over Time?
Sebastian Dziadzio, Vishaal Udandarao, Karsten Roth, Ameya Prabhu, Zeynep Akata, Samuel Albanie, Matthias Bethge

TL;DR
This paper introduces TIME, a unified framework for temporal model merging that addresses the challenges of integrating expert models over time as new tasks and domains emerge, providing insights and best practices.
Contribution
The paper proposes the TIME framework to systematically study and improve temporal model merging across different phases, model sizes, and compute budgets.
Findings
Identifies key challenges in temporal model merging.
Provides best practices for initialization, merging techniques, and deployment.
Offers empirical insights on model performance over time.
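The three design axes named above (initialization, merging technique, deployment) can be illustrated with a toy loop. This is a minimal sketch under stated assumptions, not the paper's implementation: models are plain dictionaries of floats, "training" is a hypothetical function that nudges one parameter, and the merge is uniform weight averaging, chosen only because it is the simplest option. The names `average_merge`, `temporal_merge`, `make_task`, and `init_from_merged` are made up for illustration.

```python
from typing import Callable, Dict, List

Weights = Dict[str, float]  # toy stand-in for a model's parameters


def average_merge(models: List[Weights]) -> Weights:
    """Uniform weight averaging (an assumed, simplest-case merging technique)."""
    keys = models[0].keys()
    return {k: sum(m[k] for m in models) / len(models) for k in keys}


def temporal_merge(
    base: Weights,
    train_fns: List[Callable[[Weights], Weights]],
    init_from_merged: bool = True,
) -> List[Weights]:
    """Return the merged model that would be deployed after each time step."""
    experts: List[Weights] = []
    deployed: List[Weights] = []
    merged = base
    for train in train_fns:
        # Initialization choice: start from the merged past experts or the base model.
        start = merged if init_from_merged else base
        expert = train(dict(start))  # copy so earlier models are not mutated
        experts.append(expert)
        # Merging choice: here, all experts seen so far are averaged at every step.
        merged = average_merge(experts)
        # Deployment choice: here, the deployed model is the same merge used for init.
        deployed.append(merged)
    return deployed


def make_task(key: str, delta: float) -> Callable[[Weights], Weights]:
    """Hypothetical 'training': shift a single parameter by delta."""
    def train(w: Weights) -> Weights:
        w[key] += delta
        return w
    return train


base = {"a": 0.0, "b": 0.0}
tasks = [make_task("a", 1.0), make_task("b", 1.0)]
from_merged = temporal_merge(base, tasks, init_from_merged=True)
from_base = temporal_merge(base, tasks, init_from_merged=False)
```

In this toy setting the two initialization choices already give different final models: starting each expert from the merged model carries earlier updates forward, while starting from the base model keeps experts independent. The sketch uses one merge for both initialization and deployment; the abstract raises the question of whether those two should differ.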
Abstract
Model merging combines multiple expert models - finetuned from a base foundation model on diverse tasks and domains - into a single, more capable model. However, most existing model merging approaches assume that all experts are available simultaneously. In reality, new tasks and domains emerge progressively over time, requiring strategies to integrate the knowledge of expert models as they become available: a process we call temporal model merging. The temporal dimension introduces unique challenges not addressed in prior work, raising new questions such as: when training for a new task, should the expert model start from the merged past experts or from the original base model? Should we merge all models at each time step? Which merging techniques are best suited for temporal merging? Should different strategies be used to initialize the training and deploy the model? To answer these…
Taxonomy
Topics: Speech and dialogue systems
Methods: Balanced Selection
