Sens-Merging: Sensitivity-Guided Parameter Balancing for Merging Large Language Models
Shuqi Liu, Han Wu, Bowei He, Xiongwei Han, Mingxuan Yuan, Linqi Song

TL;DR
Sens-Merging uses parameter sensitivity to set merging coefficients when combining fine-tuned large language models, improving performance across tasks and, in code generation, letting the merged model outperform the specialized fine-tuned model.
Contribution
The paper proposes Sens-Merging, a novel sensitivity-based coefficient adjustment method that enhances existing model merging techniques by considering parameter importance within and across tasks.
Findings
Improves performance on multiple tasks including knowledge, reasoning, and code generation.
Enables merged models to outperform fine-tuned models in code generation.
Reveals trade-offs between task-specific and cross-task scalings.
Abstract
Recent advances in large language models have led to numerous task-specialized fine-tuned variants, creating a need for efficient model merging techniques that preserve specialized capabilities while avoiding costly retraining. While existing task vector-based merging methods show promise, they typically apply uniform coefficients across all parameters, overlooking varying parameter importance both within and across tasks. We present Sens-Merging, a sensitivity-guided coefficient adjustment method that enhances existing model merging techniques by operating at both task-specific and cross-task levels. Our method analyzes parameter sensitivity within individual tasks and evaluates cross-task transferability to determine optimal merging coefficients. Extensive experiments on Mistral 7B and LLaMA2-7B/13B models demonstrate that Sens-Merging significantly improves performance across general…
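The contrast the abstract draws between uniform coefficients and parameter-aware coefficients can be illustrated with a toy sketch of task-vector merging. This is not the paper's algorithm: it does not compute sensitivity or transferability scores at all. The layer names, weights, and per-layer coefficient values below are made up for illustration, and `task_vector` and `merge` are hypothetical helpers. The sketch only shows where per-layer, per-task coefficients would plug in once some method has produced them.

```python
from typing import Dict, List

# Toy stand-in for model weights: layer name -> flattened weight values.
Params = Dict[str, List[float]]


def task_vector(base: Params, finetuned: Params) -> Params:
    """Task vector: fine-tuned weights minus base weights, layer by layer."""
    return {k: [f - b for f, b in zip(finetuned[k], base[k])] for k in base}


def merge(base: Params, task_vectors: List[Params],
          coeffs: List[Dict[str, float]]) -> Params:
    """Compute base + sum_i coeffs[i][layer] * task_vectors[i][layer]."""
    merged: Params = {}
    for layer, weights in base.items():
        out = list(weights)
        for tv, c in zip(task_vectors, coeffs):
            scale = c[layer]
            out = [w + scale * d for w, d in zip(out, tv[layer])]
        merged[layer] = out
    return merged


# Made-up weights for a base model and two fine-tuned variants.
base = {"layer0": [1.0, 1.0], "layer1": [0.0, 2.0]}
task_a = {"layer0": [2.0, 1.0], "layer1": [0.0, 4.0]}
task_b = {"layer0": [1.0, 3.0], "layer1": [2.0, 2.0]}

tvs = [task_vector(base, task_a), task_vector(base, task_b)]

# Uniform coefficients: the same scale for every layer of every task.
uniform = [{"layer0": 0.5, "layer1": 0.5}, {"layer0": 0.5, "layer1": 0.5}]

# Per-layer, per-task coefficients. These values are arbitrary placeholders;
# a sensitivity-guided method would derive them from the models and data.
per_layer = [{"layer0": 1.0, "layer1": 0.5}, {"layer0": 0.5, "layer1": 1.0}]

merged_uniform = merge(base, tvs, uniform)
merged_per_layer = merge(base, tvs, per_layer)
```

With uniform coefficients every layer of every task vector is scaled identically, while the second call lets each task contribute more in the layers where its coefficient is larger. How those coefficients are obtained is the subject of the paper and is not reproduced here.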
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Multi-Agent Systems and Negotiation
