Domain-Adaptive Model Merging Across Disconnected Modes

Junming Liu; Yusen Zhang; Rongchao Zhang; Wenkai Zhu; Tian Wu

arXiv:2603.05957 · cs.DC · April 16, 2026

TL;DR

This paper introduces DMM, a data-free model merging framework that combines highly divergent models trained on different domains while preserving rare but critical knowledge.

Contribution

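DMM is the first framework to handle highly divergent models through a three-step process: training domain-specific models independently, merging highly similar models with standard techniques, and distilling knowledge from the remaining divergent models into the merged model using pseudo-data synthesized from normalization statistics.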

Findings

01. DMM achieves state-of-the-art performance on various benchmarks.

02. It merges highly divergent models without data sharing.

03. DMM preserves rare and critical knowledge during merging.

Abstract

Learning across domains is challenging when data cannot be centralized due to privacy or heterogeneity, which limits the ability to train a single comprehensive model. Model merging provides an appealing alternative by consolidating knowledge from multiple specialized models into one, avoiding data sharing and reducing retraining cost. In this work, we present DMM, a data-free model merging framework designed to handle highly divergent models. DMM proceeds in three steps. First, domain-specific models are trained independently. Second, models with high similarity are merged using standard techniques to ensure stability. Third, we synthesize pseudo-data from normalization statistics and distill knowledge from divergent models into the merged model through a lightweight refinement guided by these samples. This approach preserves rare but critical knowledge while maintaining stability.…
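
The second and third steps described above can be illustrated with a minimal sketch. It is not the authors' implementation. The abstract does not specify the similarity measure, the merging rule, or the synthesis procedure, so the sketch assumes cosine similarity over flattened weights, plain parameter averaging as the "standard technique", and independent Gaussian sampling from per-channel running mean and variance as a stand-in for pseudo-data synthesis. The names `merge_similar`, `sample_pseudo_features`, and the `threshold` parameter are hypothetical, and the distillation loop itself is omitted.

```python
import math
import random


def flatten(model):
    """Concatenate all parameter lists of a model (dict: name -> list of floats)."""
    return [w for name in sorted(model) for w in model[name]]


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def merge_similar(models, threshold=0.9):
    """Average the models that are close to the first model (assumed rule).

    Models whose cosine similarity to models[0] is below `threshold` are
    returned separately as "divergent"; in the abstract's third step their
    knowledge would be distilled into the merged model rather than averaged.
    """
    anchor = flatten(models[0])
    similar, divergent = [], []
    for m in models:
        if cosine_similarity(anchor, flatten(m)) >= threshold:
            similar.append(m)
        else:
            divergent.append(m)
    merged = {
        name: [sum(vals) / len(similar) for vals in zip(*(m[name] for m in similar))]
        for name in models[0]
    }
    return merged, divergent


def sample_pseudo_features(running_mean, running_var, n, seed=0):
    """Draw n pseudo-samples, one Gaussian per channel (assumed synthesis rule)."""
    rng = random.Random(seed)
    return [
        [rng.gauss(mu, math.sqrt(var)) for mu, var in zip(running_mean, running_var)]
        for _ in range(n)
    ]


# Toy usage: two nearby models are averaged, the third is set aside.
toy_models = [{"w": [1.0, 0.0]}, {"w": [0.9, 0.1]}, {"w": [-1.0, 0.5]}]
merged, divergent = merge_similar(toy_models, threshold=0.9)
pseudo = sample_pseudo_features([0.0, 1.0], [1.0, 4.0], n=5)
```

Gating the average on similarity reflects the abstract's stated motivation: averaging only nearby models keeps the merge stable, while divergent models are handled by distillation instead of direct weight interpolation.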
