How to Merge Your Multimodal Models Over Time?

Sebastian Dziadzio; Vishaal Udandarao; Karsten Roth; Ameya Prabhu; Zeynep Akata; Samuel Albanie; Matthias Bethge

arXiv:2412.06712 · cs.LG · December 10, 2024

TL;DR

This paper introduces TIME, a unified framework for temporal model merging that addresses the challenges of integrating expert models over time as new tasks and domains emerge, providing insights and best practices.

Contribution

The paper proposes the TIME framework to systematically study and improve temporal model merging across different phases, model sizes, and compute budgets.

Findings

01. Identifies key challenges in temporal model merging.

02. Provides best practices for initialization, merging techniques, and deployment.

03. Offers empirical insights on model performance over time.

Abstract

Model merging combines multiple expert models - finetuned from a base foundation model on diverse tasks and domains - into a single, more capable model. However, most existing model merging approaches assume that all experts are available simultaneously. In reality, new tasks and domains emerge progressively over time, requiring strategies to integrate the knowledge of expert models as they become available: a process we call temporal model merging. The temporal dimension introduces unique challenges not addressed in prior work, raising new questions such as: when training for a new task, should the expert model start from the merged past experts or from the original base model? Should we merge all models at each time step? Which merging techniques are best suited for temporal merging? Should different strategies be used to initialize the training and deploy the model? To answer these…
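
The questions raised in the abstract (where to initialize each new expert, and whether to merge all experts at every step) can be made concrete with a toy loop. The following is a minimal sketch and not the authors' implementation: the names `temporal_merge`, `average`, `init_from_merged`, and `merge_all` are hypothetical, weights are plain Python lists, and uniform parameter averaging stands in for whichever merging technique is actually used.

```python
from typing import Callable, Dict, List

# Toy stand-in for a model: parameter name -> flat list of floats.
Weights = Dict[str, List[float]]


def average(models: List[Weights]) -> Weights:
    """Uniform parameter-wise average (an assumed, simple merging technique)."""
    n = len(models)
    return {
        name: [sum(m[name][i] for m in models) / n
               for i in range(len(models[0][name]))]
        for name in models[0]
    }


def temporal_merge(
    base: Weights,
    tasks: List[float],
    train: Callable[[Weights, float], Weights],
    init_from_merged: bool = True,
    merge_all: bool = True,
) -> List[Weights]:
    """Integrate experts one task at a time; return the merged model per step.

    init_from_merged: start each expert from the current merged model
                      (True) or from the original base model (False).
    merge_all:        re-merge every expert seen so far (True) or only the
                      previous merged model with the newest expert (False).
    """
    experts: List[Weights] = []
    merged = base
    history: List[Weights] = []
    for task in tasks:
        start = merged if init_from_merged else base
        expert = train(start, task)  # hypothetical finetuning step
        experts.append(expert)
        pool = experts if merge_all else [merged, expert]
        merged = average(pool)
        history.append(merged)
    return history


def toy_train(start: Weights, task: float) -> Weights:
    """Pretend finetuning: shift every parameter by a task-specific offset."""
    return {name: [w + task for w in values] for name, values in start.items()}


if __name__ == "__main__":
    base = {"w": [0.0, 0.0]}
    for init_flag in (False, True):
        final = temporal_merge(base, [1.0, 3.0], toy_train,
                               init_from_merged=init_flag)[-1]
        print(f"init_from_merged={init_flag}: {final}")
```

Even in this toy setting the two initialization choices yield different final weights, which is the kind of design decision the abstract says the paper studies.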

Code & Models

Repositories

explainableml/fomo_in_flux (PyTorch, official)

Taxonomy

Topics: Speech and dialogue systems

Methods: Balanced Selection