Beyond Parameter Arithmetic: Sparse Complementary Fusion for Distribution-Aware Model Merging
Weihong Lin, Lin Sun, Qilong Shi, Aomufei Yuan, Yuxuan Tian, Zhengyang Wang, Guangxiang Zhao, Xiangzheng Zhang, Tong Yang

TL;DR
This paper introduces Sparse Complementary Fusion with reverse KL (SCF-RKL), a novel model merging method that improves stability and generalization by controlling functional interference through sparse, distribution-aware updates, outperforming existing methods across diverse benchmarks.
Contribution
The paper presents SCF-RKL, a new model merging framework that explicitly measures and minimizes functional divergence using reverse KL divergence, enabling more stable and effective model integration.
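For reference, the reverse KL divergence named above is conventionally written as below. The symbols are assumptions, since this summary does not define them: Q stands for the distribution being fit, P for the reference distribution, and x ranges over outcomes such as next-token candidates. Which model plays which role in SCF-RKL is not stated here.

```latex
% Reverse KL: the expectation is taken under Q, the distribution being fit.
D_{\mathrm{KL}}(Q \,\|\, P) = \sum_{x} Q(x) \log \frac{Q(x)}{P(x)}
% Forward KL, for contrast, takes the expectation under the reference P.
D_{\mathrm{KL}}(P \,\|\, Q) = \sum_{x} P(x) \log \frac{P(x)}{Q(x)}
```

The reverse form heavily penalizes Q for placing mass where P has little, which is why it is usually described as mode-seeking.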
Findings
SCF-RKL outperforms existing merging methods on 24 benchmarks.
It maintains stable representations while integrating new capabilities.
The approach improves generalization and generation stability.
Abstract
Model merging has emerged as a promising paradigm for composing the capabilities of large language models by directly operating in weight space, enabling the integration of specialized models without costly retraining. However, existing merging methods largely rely on parameter-space heuristics, which often introduce severe interference, leading to degraded generalization and unstable generation behaviors such as repetition and incoherent outputs. In this work, we propose Sparse Complementary Fusion with reverse KL (SCF-RKL), a novel model merging framework that explicitly controls functional interference through sparse, distribution-aware updates. Instead of assuming linear additivity in parameter space, SCF-RKL measures the functional divergence between models using reverse Kullback-Leibler divergence and selectively incorporates complementary parameters. This mode-seeking,…
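The two ingredients the abstract names, a reverse KL measurement between model output distributions and a sparse parameter update, can be illustrated with a minimal NumPy sketch. This is not the authors' algorithm: the function names `reverse_kl` and `sparse_update`, the `keep_fraction` parameter, and the magnitude-based top-k selection rule are all hypothetical stand-ins, chosen because the selection criterion is not given in this summary.

```python
import numpy as np


def log_softmax(logits):
    """Numerically stable log-softmax over the last axis."""
    shifted = logits - logits.max(axis=-1, keepdims=True)
    return shifted - np.log(np.exp(shifted).sum(axis=-1, keepdims=True))


def reverse_kl(logits_q, logits_p):
    """KL(q || p), averaged over rows; the expectation is taken under q."""
    log_q = log_softmax(logits_q)
    log_p = log_softmax(logits_p)
    return float((np.exp(log_q) * (log_q - log_p)).sum(axis=-1).mean())


def sparse_update(base, donor, keep_fraction=0.1):
    """Copy only the largest-magnitude differences from donor into base.

    Assumption: magnitude-based top-k is an illustrative selection rule,
    not the criterion used by SCF-RKL.
    """
    delta = donor - base
    k = max(1, int(round(keep_fraction * delta.size)))
    # The k-th largest absolute difference serves as the keep threshold.
    threshold = np.partition(np.abs(delta).ravel(), -k)[-k]
    mask = np.abs(delta) >= threshold
    return base + mask * delta, mask


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    base_w = rng.normal(size=(4, 4))
    donor_w = base_w + rng.normal(scale=0.1, size=(4, 4))
    merged_w, mask = sparse_update(base_w, donor_w, keep_fraction=0.25)
    x = rng.normal(size=(8, 4))
    # Divergence between the merged and base output distributions on x.
    print(reverse_kl(x @ merged_w, x @ base_w), int(mask.sum()))
```

In a distribution-aware scheme, a divergence like `reverse_kl` would guide which parameters are worth changing; here the two pieces are kept separate so each can be checked on its own.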
Taxonomy
Topics: Multimodal Machine Learning Applications · Topic Modeling · Domain Adaptation and Few-Shot Learning
