Auto-FlexSwitch: Efficient Dynamic Model Merging via Learnable Task Vector Compression

Junqi Gao; Dazhi Zhang; Zhichang Guo; Biqing Qi; Yi Ran; Wangmeng Zuo

arXiv:2604.28109·cs.LG·May 1, 2026

TL;DR

Auto-FlexSwitch introduces a learnable, highly efficient dynamic model merging method that compresses task vectors for multi-task adaptation, reducing storage and maintaining high performance.

Contribution

It proposes a framework for dynamic model merging built on task vector compression, combining training-free components with learnable, adaptive strategies.

Findings

01

Task vectors exhibit impulse-like activation patterns and robustness to low-bit representations.

02

The proposed methods achieve high-fidelity approximation at high compression ratios.

03

Auto-FlexSwitch enables efficient, dynamic multi-task model merging with optimized storage.

Abstract

Model merging has attracted attention as an effective path toward multi-task adaptation by integrating knowledge from multiple task-specific models. Among existing approaches, dynamic merging mitigates performance degradation caused by conflicting parameter updates across tasks by flexibly combining task-specific parameters at inference time, thereby maintaining high performance. However, these methods require storing independent parameters for each task, resulting in prohibitive storage overhead. To address this issue, we first experimentally demonstrate that the fine-tuned weight increments (referred to as task vectors) exhibit an impulse-like activation pattern and high robustness to low-bit representations. Driven by this insight, we propose T-Switch, which decomposes task vectors into three compact components: a binary sparse mask, a sign vector, and a scalar scaling factor,…
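The three-part decomposition named in the abstract (binary sparse mask, sign vector, scalar scaling factor) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names, the top-magnitude rule for choosing the mask, the `keep_ratio` parameter, and the use of the mean retained magnitude as the scaling factor are all assumptions made for illustration, since the abstract does not specify how each component is computed.

```python
import numpy as np

def compress_task_vector(pretrained, finetuned, keep_ratio=0.1):
    """Split a task vector into (mask, signs, scale).

    Hypothetical helper: the selection rule and the scale estimate are
    illustrative assumptions, not the method described in the paper.
    """
    tau = finetuned - pretrained              # task vector: fine-tuned weight increment
    magnitude = np.abs(tau)
    k = max(1, int(round(keep_ratio * tau.size)))
    # Assumed rule: keep the k largest-magnitude entries.
    threshold = np.partition(magnitude.ravel(), -k)[-k]
    mask = magnitude >= threshold             # binary sparse mask (1 bit per entry)
    signs = tau > 0                           # sign vector (True means positive)
    scale = float(magnitude[mask].mean())     # assumed scalar: mean retained magnitude
    return mask, signs, scale

def decompress_task_vector(mask, signs, scale):
    """Rebuild the approximate task vector as scale * mask * sign."""
    sign_values = np.where(signs, 1.0, -1.0)
    return scale * mask * sign_values

# Synthetic "impulse-like" task vector: a few large entries plus small noise.
rng = np.random.default_rng(0)
pretrained = rng.normal(size=1000)
tau_true = np.zeros(1000)
spikes = rng.choice(1000, size=50, replace=False)
tau_true[spikes] = 0.5 * rng.choice([-1.0, 1.0], size=50)
finetuned = pretrained + tau_true + rng.normal(scale=0.01, size=1000)

mask, signs, scale = compress_task_vector(pretrained, finetuned, keep_ratio=0.05)
approx = decompress_task_vector(mask, signs, scale)
merged = pretrained + approx                  # weights with the compressed task vector applied
```

Under these assumptions, each task costs two bits per parameter (mask and sign) plus one scalar, rather than a full-precision copy of the weight increments, which is the kind of storage saving the abstract motivates.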
