Model Breadcrumbs: Scaling Multi-Task Model Merging with Sparse Masks
MohammadReza Davari, Eugene Belilovsky

TL;DR
Model Breadcrumbs is a novel method for merging multiple fine-tunings of foundation models using sparse weight masks, enabling efficient multi-task learning and model updates without extensive hyperparameter tuning.
Contribution
Introduces Model Breadcrumbs, a simple and efficient approach for multi-task model merging using sparse weight masks, avoiding hyperparameter tuning for new tasks.
Findings
Effective in improving multi-task performance
More efficient than previous methods
Does not require hyperparameter tuning for new tasks
Abstract
The rapid development of AI systems has been greatly influenced by the emergence of foundation models. A common approach for targeted problems involves fine-tuning these pre-trained foundation models for specific target tasks, resulting in a rapid spread of models fine-tuned across a diverse array of tasks. This work focuses on the problem of merging multiple fine-tunings of the same foundation model derived from a spectrum of auxiliary tasks. We introduce a new simple method, Model Breadcrumbs, which consists of a sparsely defined weight set that guides model adaptation within the weight space of a pre-trained model. These breadcrumbs are constructed by subtracting the weights from a pre-trained model before and after fine-tuning, followed by a sparsification process that eliminates weight outliers and negligible perturbations. Our experiments demonstrate the effectiveness of Model…
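The construction described in the abstract can be sketched as below. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names, the `low_frac` and `high_frac` cutoffs, and the `alpha` scaling factor are hypothetical, weights are treated as one flat list of floats, and the merge step (adding scaled breadcrumbs back onto the pre-trained weights) is assumed rather than taken from the text above.

```python
def make_breadcrumb(pretrained, finetuned, low_frac=0.1, high_frac=0.1):
    """Sparsified weight difference for one fine-tuning (illustrative sketch).

    low_frac and high_frac are hypothetical cutoffs: the fraction of
    smallest-magnitude and largest-magnitude differences to zero out.
    """
    # Difference between the weights after and before fine-tuning.
    diff = [f - p for p, f in zip(pretrained, finetuned)]

    # Rank positions by the absolute size of the difference.
    order = sorted(range(len(diff)), key=lambda i: abs(diff[i]))
    n_low = int(len(diff) * low_frac)    # negligible perturbations to drop
    n_high = int(len(diff) * high_frac)  # outliers to drop

    # Keep only the middle band of the ranking; everything else is masked.
    keep = set(order[n_low:len(diff) - n_high])
    return [d if i in keep else 0.0 for i, d in enumerate(diff)]


def merge(pretrained, breadcrumbs, alpha=1.0):
    """Assumed merge rule: add each scaled breadcrumb to the pre-trained weights."""
    merged = list(pretrained)
    for crumb in breadcrumbs:
        merged = [m + alpha * c for m, c in zip(merged, crumb)]
    return merged
```

With ten weights and both cutoffs at 0.1, the single smallest and single largest differences are masked and the remaining eight are kept. A real model would presumably apply this to each weight tensor rather than to one flat list; that choice is also left as an assumption here.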
Code & Models
- 🤗 Aleteian/Pathfinder-RP-12B-RU (model) · 8 dl · ♡ 1
- 🤗 Nohobby/ignore_Q2.5-test (model) · 2 dl · ♡ 2
- 🤗 Aleteian/Legend-of-the-Four-Winds-MN-12B (model) · 7 dl · ♡ 2
- 🤗 mergekit-community/L3.3-Test-Step2 (model)
- 🤗 Aleteian/Way-to-Unseen-Horizon-MN-12B (model) · 8 dl · ♡ 3
- 🤗 estrogen/ms-24b-toasty-heal-merge (model) · 2 dl · ♡ 1
- 🤗 estrogen/test-mergekitty-gui (model) · 1 dl
- 🤗 Aleteian/Legend-of-the-Four-Winds-2-MN-12B (model) · 5 dl · ♡ 2
- 🤗 TareksGraveyard/Pathos-Beta-LLaMa-70B (model) · 4 dl
- 🤗 TareksGraveyard/Pathos-Eta-LLaMa-70B (model) · 4 dl · ♡ 1
Taxonomy
Topics: Adversarial Robustness in Machine Learning · Machine Learning and Data Classification · Anomaly Detection Techniques and Applications
Methods: Sparse Evolutionary Training
