Generalizing the Geometry of Model Merging Through Frechet Averages
Marvin F. da Silva, Mohammed Adnan, Felix Dangel, Sageev Oore

TL;DR
This paper introduces a symmetry-aware model merging method using Fréchet averaging on manifolds, improving robustness over naive averaging, with a specific focus on low-rank adapters such as LoRA.
Contribution
It proposes a general framework for symmetry-invariant model merging via Fréchet averaging, encompassing Fisher merging and addressing LoRA-specific geometries.
Findings
Fréchet averaging provides a robust, symmetry-aware model merging technique.
The approach generalizes Fisher merging and improves LoRA merging methods.
Experimental results show improved performance of the merged models.
Abstract
Model merging aims to combine multiple models into one without additional training. Naïve parameter-space averaging can be fragile under architectural symmetries, as their geometry does not take them into account. In this work we show that not only the geometry, but also the averaging procedure itself, must be symmetry-invariant to achieve symmetry-aware merges. Consequently, we propose a general solution: merging as Fréchet averaging, i.e., selecting parameters that minimize a sum of geodesic distances on an appropriate manifold. In this view, the key design choice is the overall geometry, i.e., the choice of metric, manifold, and distance approximation, that determines what it means for two models to be "close". We show that Fréchet averaging, combined with simplifying assumptions, contains Fisher merging. Building on this, we examine the particular case of low-rank adapters…
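To make the link between Fréchet averaging and Fisher merging concrete, here is a minimal sketch rather than the paper's derivation. It assumes a flat parameter space, squared distances (as in the standard definition of a Fréchet mean), and one diagonal Fisher matrix per model used as that model's local metric. Under these assumptions the minimizer has a closed form, which is the Fisher-weighted average. The function names `fisher_merge`, `frechet_objective`, and `frechet_mean_gd` are hypothetical and chosen for illustration only.

```python
import numpy as np


def fisher_merge(thetas, fishers, eps=1e-12):
    """Closed-form Fisher-weighted average (diagonal Fisher, assumed)."""
    thetas = np.asarray(thetas, dtype=float)    # shape (n_models, n_params)
    fishers = np.asarray(fishers, dtype=float)  # same shape, non-negative
    return (fishers * thetas).sum(axis=0) / (fishers.sum(axis=0) + eps)


def frechet_objective(theta, thetas, fishers):
    """Sum over models of squared distances under each diagonal metric."""
    diffs = np.asarray(theta, dtype=float) - np.asarray(thetas, dtype=float)
    return float((np.asarray(fishers, dtype=float) * diffs ** 2).sum())


def frechet_mean_gd(thetas, fishers, steps=500):
    """Minimize frechet_objective by plain gradient descent."""
    thetas = np.asarray(thetas, dtype=float)
    fishers = np.asarray(fishers, dtype=float)
    theta = thetas.mean(axis=0)  # start from the naive average
    # Step size 1/L, where L = 2 * max_j sum_i F_ij bounds the curvature.
    lr = 1.0 / (2.0 * fishers.sum(axis=0).max())
    for _ in range(steps):
        grad = 2.0 * (fishers * (theta - thetas)).sum(axis=0)
        theta = theta - lr * grad
    return theta


if __name__ == "__main__":
    thetas = [[1.0, 2.0], [3.0, 0.0]]   # two toy "models", two parameters
    fishers = [[1.0, 1.0], [3.0, 1.0]]  # toy diagonal Fisher values
    print("closed form:     ", fisher_merge(thetas, fishers))
    print("gradient descent:", frechet_mean_gd(thetas, fishers))
```

Both routes reach the same point in this toy setting, which illustrates how the choice of metric alone changes the merge: with all Fisher values equal, the result falls back to the plain parameter average. A curved manifold or a symmetry-invariant metric would need geodesic distances instead of these weighted Euclidean ones.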
