Activated Parameter Locating via Causal Intervention for Model Merging

Fanshuang Kong; Richong Zhang; Ziqiao Wang

arXiv:2408.09485·cs.CL·August 20, 2024

Open Access

TL;DR

This paper introduces a causal intervention-based method called Activated Parameter Locating (APL) for model merging, which improves parameter selection by estimating importance more accurately, leading to better conflict resolution and performance.

Contribution

The paper proposes a novel APL method utilizing causal intervention for precise parameter importance estimation in model merging, along with a gradient approximation strategy to reduce computational complexity.
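As a rough illustration of why a gradient approximation can cut cost, the sketch below compares two ways of scoring parameter partitions on a toy quadratic loss. The summary does not give the paper's actual formulation, so everything here is an assumption made for illustration: the loss, the partitioning, and the function names `intervention_importance` and `first_order_importance` are hypothetical, and the approximation shown is a generic first-order Taylor expansion rather than the authors' derivation.

```python
def loss(w, target):
    """Toy quadratic loss standing in for a task loss (an assumption)."""
    return 0.5 * sum((wi - ti) ** 2 for wi, ti in zip(w, target))


def grad(w, target):
    """Analytic gradient of the toy loss with respect to w."""
    return [wi - ti for wi, ti in zip(w, target)]


def intervention_importance(pre, ft, target, partitions):
    """Score each partition by reverting it to its pre-trained values
    and measuring the loss change: one extra loss evaluation per partition."""
    base = loss(ft, target)
    scores = []
    for idx in partitions:
        w = list(ft)
        for i in idx:
            w[i] = pre[i]  # intervene: remove this partition's delta
        scores.append(loss(w, target) - base)
    return scores


def first_order_importance(pre, ft, target, partitions):
    """Approximate the same loss change with a first-order Taylor expansion
    around the fine-tuned weights: one gradient evaluation for all partitions."""
    g = grad(ft, target)
    return [sum(-g[i] * (ft[i] - pre[i]) for i in idx) for idx in partitions]
```

With `k` partitions, the direct version needs `k` additional loss evaluations while the approximate version needs a single gradient. On this toy loss the two scores differ by exactly the second-order term, so they agree closely only when the deltas are small; the ranking of partitions can still match when they are not.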

Findings

01. APL outperforms existing methods in both in-domain and out-of-domain settings.

02. The gradient approximation reduces computational costs without sacrificing accuracy.

03. Experiments demonstrate improved conflict mitigation and model performance.

Abstract

Model merging combines multiple homologous models into one model, achieving convincing generalization without additional training. A key challenge in this problem is resolving parameter redundancies and conflicts across multiple models. Existing methods have demonstrated that dropping a portion of delta parameters can alleviate conflicts while maintaining performance. However, these methods often drop parameters either randomly or based on magnitude, overlooking task-specific information embedded in fine-tuned models. In this paper, we propose an Activated Parameter Locating (APL) method that utilizes causal intervention to estimate parameter importance, enabling more precise parameter drops and better conflict mitigation. Moreover, to reduce the computational complexity associated with a large number of parameter partitions, we also introduce a theoretically supported…
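The two baseline strategies the abstract contrasts with APL, dropping delta parameters randomly or by magnitude, can be sketched in a few lines. This is a minimal sketch over flat lists of floats, not the paper's code: the function names, the rescaling of surviving entries after a random drop, and the simple additive `merge` are assumptions chosen for illustration.

```python
import random


def delta(finetuned, pretrained):
    """Delta parameters: fine-tuned weights minus the shared pre-trained weights."""
    return [f - p for f, p in zip(finetuned, pretrained)]


def drop_random(d, drop_rate, rng):
    """Zero each entry with probability drop_rate; rescale survivors by
    1 / (1 - drop_rate) so the expected value of each entry is unchanged."""
    keep = 1.0 - drop_rate
    return [x / keep if rng.random() < keep else 0.0 for x in d]


def drop_by_magnitude(d, drop_rate):
    """Keep only the largest-magnitude entries and zero the rest."""
    n_keep = len(d) - int(round(drop_rate * len(d)))
    order = sorted(range(len(d)), key=lambda i: abs(d[i]), reverse=True)
    kept = set(order[:n_keep])
    return [x if i in kept else 0.0 for i, x in enumerate(d)]


def merge(pretrained, deltas, scale=1.0):
    """Add the scaled sum of the (pruned) deltas back onto the pre-trained weights."""
    return [p + scale * sum(col) for p, col in zip(pretrained, zip(*deltas))]
```

A typical use would compute one delta per fine-tuned model, prune each with one of the two drop functions, and pass the pruned deltas to `merge`. Neither drop rule looks at task data, which is the gap the abstract says APL targets.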

Taxonomy

Topics: Reinforcement Learning in Robotics · Machine Learning and Algorithms