No Task Left Behind: Isotropic Model Merging with Common and Task-Specific Subspaces
Daniel Marczak, Simone Magistri, Sebastian Cygert, Bartłomiej Twardowski, Andrew D. Bagdanov, Joost van de Weijer

TL;DR
This paper introduces an isotropic merging framework that flattens the singular value spectrum of task matrices to improve their alignment with the merged model, reducing the performance gap between the merged model and single-task models.
Contribution
It proposes a novel isotropic merging method that flattens singular value spectra and incorporates common and task-specific subspaces for improved model merging.
Findings
Achieves state-of-the-art performance on vision and language tasks.
Enhancing alignment between task-specific and merged matrices improves merging effectiveness.
Reduces performance gap without additional training.
Abstract
Model merging integrates the weights of multiple task-specific models into a single multi-task model. Despite recent interest in the problem, a significant performance gap between the combined and single-task models remains. In this paper, we investigate the key characteristics of task matrices -- weight update matrices applied to a pre-trained model -- that enable effective merging. We show that alignment between singular components of task-specific and merged matrices strongly correlates with performance improvement over the pre-trained model. Based on this, we propose an isotropic merging framework that flattens the singular value spectrum of task matrices, enhances alignment, and reduces the performance gap. Additionally, we incorporate both common and task-specific subspaces to further improve alignment and performance. Our proposed approach achieves state-of-the-art performance on…
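The spectrum-flattening idea described in the abstract can be illustrated with a minimal NumPy sketch. It rests on assumptions the abstract does not spell out: the task matrices are summed before decomposition, every singular value of the sum is replaced by the mean singular value, and the merged update is added to the pre-trained weights with a scaling factor of 1. The function name `isotropic_merge` and the toy random weights are hypothetical, and the common and task-specific subspace step is not covered.

```python
import numpy as np


def isotropic_merge(task_matrices):
    """Sum the task matrices and flatten the singular value spectrum of the sum.

    Assumption: "flattening" means replacing every singular value with the
    mean singular value while keeping the singular vectors unchanged.
    """
    summed = np.sum(task_matrices, axis=0)
    u, s, vt = np.linalg.svd(summed, full_matrices=False)
    flat = np.full_like(s, s.mean())  # isotropic (uniform) spectrum
    return (u * flat) @ vt            # equivalent to u @ diag(flat) @ vt


# Toy data (hypothetical): one pre-trained layer and three fine-tuned copies.
rng = np.random.default_rng(0)
pretrained = rng.normal(size=(6, 4))
finetuned = [pretrained + 0.1 * rng.normal(size=(6, 4)) for _ in range(3)]

# Task matrices are the weight updates relative to the pre-trained model.
task_matrices = [w - pretrained for w in finetuned]

merged_update = isotropic_merge(task_matrices)
merged_weights = pretrained + merged_update  # scaling factor of 1 assumed

# After flattening, all singular values of the merged update coincide.
singular_values = np.linalg.svd(merged_update, compute_uv=False)
```

In this sketch the singular vectors of the summed update are preserved, so the merged matrix spans the same subspace as the plain sum; only the relative weighting of its directions changes.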
Taxonomy
Topics: Multi-Agent Systems and Negotiation · Simulation Techniques and Applications · Scientific Computing and Data Management
