Mitigating the Backdoor Effect for Multi-Task Model Merging via Safety-Aware Subspace
Jinluan Yang, Anke Tang, Didi Zhu, Zhengyu Chen, Li Shen, Fei Wu

TL;DR
This paper introduces a defense-aware model merging method that reduces backdoor vulnerabilities in multi-task models by using dual masks to balance task performance and security, demonstrating improved robustness against attacks.
Contribution
The paper proposes a novel meta-learning-based merging approach with dual masks to mitigate backdoor effects while maintaining task performance.
Findings
Reduces backdoor attack success rate by 2-10 percentage points.
Incurs only about 1% accuracy loss relative to non-secure merging.
Demonstrates robustness across various backdoor attack types.
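For readers unfamiliar with the metric behind these findings, the following is a minimal sketch of how an attack success rate (ASR) and a reduction in percentage points are commonly computed. It assumes ASR is the share of trigger-stamped inputs that the model classifies as the attacker's target label. The function names and the toy predictions are hypothetical and do not come from the paper.

```python
from typing import Sequence


def attack_success_rate(predictions: Sequence[int], target_label: int) -> float:
    """Percentage of triggered inputs predicted as the attacker's target label.

    Assumption: `predictions` holds model outputs on inputs that already carry
    the backdoor trigger and whose true label differs from `target_label`.
    """
    if not predictions:
        raise ValueError("predictions must not be empty")
    hits = sum(1 for p in predictions if p == target_label)
    return 100.0 * hits / len(predictions)


def asr_reduction_points(asr_before: float, asr_after: float) -> float:
    """Difference between two ASR values, in percentage points (not percent)."""
    return asr_before - asr_after


# Toy, made-up predictions purely for illustration (not results from the paper).
baseline_preds = [7, 7, 3, 7]   # hypothetical outputs of an undefended merged model
defended_preds = [7, 3, 3, 1]   # hypothetical outputs of a defended merged model
baseline_asr = attack_success_rate(baseline_preds, target_label=7)
defended_asr = attack_success_rate(defended_preds, target_label=7)
reduction = asr_reduction_points(baseline_asr, defended_asr)
```

A reduction stated in percentage points is an absolute difference between two rates, so it should not be read as a relative (percent) decrease.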
Abstract
Model merging has gained significant attention as a cost-effective approach to integrate multiple single-task fine-tuned models into a unified one that can perform well on multiple tasks. However, existing model merging techniques primarily focus on resolving conflicts between task-specific models and often overlook potential security threats, particularly the risk of backdoor attacks in the open-source model ecosystem. In this paper, we first investigate the vulnerabilities of existing model merging methods to backdoor attacks, identifying two critical challenges: backdoor succession and backdoor transfer. To address these issues, we propose a novel Defense-Aware Merging (DAM) approach that simultaneously mitigates task interference and backdoor vulnerabilities. Specifically, DAM employs a meta-learning-based optimization method with dual masks to identify a shared and safety-aware…
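To make the idea of merging under two masks more concrete, here is a minimal sketch, not the authors' DAM algorithm. It assumes a task-arithmetic style merge in which each fine-tuned model contributes its difference from the pretrained weights, and a parameter update is kept only where both a hypothetical `shared_mask` and a hypothetical `safety_mask` are 1. In the paper the masks are found through a meta-learning-based optimization; here they are simply given as fixed binary inputs, and `scale` is an assumed merging coefficient.

```python
from typing import Dict, List

Weights = Dict[str, List[float]]  # parameter name -> flattened values
Mask = Dict[str, List[int]]       # parameter name -> 0/1 entries


def merge_with_masks(
    pretrained: Weights,
    finetuned_models: List[Weights],
    shared_mask: Mask,
    safety_mask: Mask,
    scale: float = 0.3,
) -> Weights:
    """Add masked task vectors to the pretrained weights.

    Illustrative only: both masks are assumed to be precomputed, whereas the
    paper learns them. An entry is updated only if both masks keep it.
    """
    merged: Weights = {}
    for name, base in pretrained.items():
        out = list(base)
        for model in finetuned_models:
            for i, (tuned, orig) in enumerate(zip(model[name], base)):
                delta = tuned - orig  # task vector entry for this model
                keep = shared_mask[name][i] * safety_mask[name][i]
                out[i] += scale * keep * delta
        merged[name] = out
    return merged


# Toy example with a single three-element parameter (values are made up).
pretrained = {"w": [0.0, 0.0, 0.0]}
models = [{"w": [1.0, 2.0, 3.0]}, {"w": [1.0, -2.0, 1.0]}]
shared_mask = {"w": [1, 1, 0]}
safety_mask = {"w": [1, 0, 1]}
merged = merge_with_masks(pretrained, models, shared_mask, safety_mask, scale=0.5)
```

In this toy case only the first entry survives both masks, so the other two entries stay at their pretrained values. Multiplying the masks is one simple way to combine them and is chosen here for illustration only.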
Taxonomy
Topics: Access Control and Trust · Business Process Modeling and Analysis · Simulation Techniques and Applications
Methods: Softmax · Attention Is All You Need · Focus
