Efficient Multi-Source Knowledge Transfer by Model Merging
Marcin Osial, Bartosz Wójcik, Bartosz Zieliński, Sebastian Cygert

TL;DR
This paper introduces a scalable, fine-grained multi-source transfer learning method using SVD to decompose, select, and fine-tune principal components from multiple models, enhancing adaptability and efficiency.
Contribution
It proposes a novel SVD-based framework for aggregating knowledge from numerous models, improving scalability and precision over existing coarse-grained methods.
Findings
Effective knowledge aggregation from multiple models in vision and language tasks.
Enhanced scalability and efficiency in transfer learning processes.
Robustness to input and parameter perturbations.
Abstract
While transfer learning is an effective strategy, it often overlooks the opportunity to leverage knowledge from numerous available models online. Addressing this multi-source transfer learning problem is a promising path to boost adaptability and cut re-training costs. However, existing methods remain inherently coarse-grained: they lack the precision needed for fine-grained knowledge extraction as well as the scalability required to aggregate knowledge from either large numbers of source models or models with high parameter counts. We address these limitations by leveraging Singular Value Decomposition (SVD) to first decompose each source model into its elementary, rank-one components. A subsequent aggregation stage then selects only the most salient components from all sources, thereby overcoming the previous efficiency and precision limitations. To best preserve and leverage the…
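The decompose-then-select idea in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation. It assumes each source model is represented by a single weight matrix of the same shape, that "most salient" means "largest singular value", and that the selected components are simply summed. The abstract states none of these details. The function names `rank_one_components` and `merge_top_k` are hypothetical, and the subsequent fine-tuning of the selected components is omitted.

```python
import numpy as np

def rank_one_components(weight):
    """Split a matrix into rank-one pieces (sigma, u, v) with
    weight == sum(sigma * outer(u, v)) over all pieces."""
    u, s, vt = np.linalg.svd(weight, full_matrices=False)
    return [(s[i], u[:, i], vt[i, :]) for i in range(len(s))]

def merge_top_k(weights, k):
    """Pool rank-one components from all sources and keep the k largest.

    Assumption: saliency is measured by singular value; the paper may
    use a different criterion.
    """
    pool = []
    for w in weights:
        pool.extend(rank_one_components(w))
    # Rank components from every source together, most salient first.
    pool.sort(key=lambda c: c[0], reverse=True)
    selected = pool[:k]
    merged = np.zeros_like(weights[0], dtype=float)
    for sigma, u, v in selected:
        merged += sigma * np.outer(u, v)
    return merged, selected

# Toy example: three random "source models" sharing one layer shape.
rng = np.random.default_rng(0)
sources = [rng.standard_normal((6, 4)) for _ in range(3)]
merged, selected = merge_top_k(sources, k=5)
```

Here `merged` has the same shape as each source matrix and is built from at most `k` rank-one terms, so the cost of the merged representation depends on `k` rather than on the number of source models.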
