BadMerging: Backdoor Attacks Against Model Merging
Jinghuai Zhang, Jianfeng Chi, Zheng Li, Kunlin Cai, Yang Zhang, and Yuan Tian

TL;DR
This paper introduces BadMerging, a novel backdoor attack targeting model merging techniques, demonstrating its effectiveness and exposing security vulnerabilities in current model merging practices.
Contribution
It presents the first backdoor attack specifically designed for model merging, with a two-stage mechanism and feature-interpolation loss to improve robustness against merging variations.
Findings
BadMerging successfully compromises merged models across various algorithms.
The attack remains effective despite different merging parameters and task domains.
Existing defenses fail to detect or mitigate BadMerging attacks.
Abstract
Fine-tuning pre-trained models for downstream tasks has led to a proliferation of open-sourced task-specific models. Recently, Model Merging (MM) has emerged as an effective approach to facilitate knowledge transfer among these independently fine-tuned models. MM directly combines multiple fine-tuned task-specific models into a merged model without additional training, and the resulting model shows enhanced capabilities in multiple tasks. Although MM provides great utility, it may come with security risks because an adversary can exploit MM to affect multiple downstream tasks. However, the security risks of MM have barely been studied. In this paper, we first find that MM, as a new learning paradigm, introduces unique challenges for existing backdoor attacks due to the merging process. To address these challenges, we introduce BadMerging, the first backdoor attack specifically designed…
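The abstract describes model merging as combining several fine-tuned models into one without additional training. A minimal sketch of one common merging scheme, often called task arithmetic, is given below for illustration. It is not taken from the paper: the function name `merge_task_arithmetic`, the `scale` parameter, and the toy weights are all assumptions, and models are represented as plain dictionaries of NumPy arrays rather than real checkpoints.

```python
import numpy as np

def merge_task_arithmetic(pretrained, finetuned_models, scale=0.3):
    """Merge fine-tuned models by adding scaled task vectors to the base weights.

    Hypothetical helper for illustration only. Each model is a dict mapping
    parameter names to NumPy arrays of identical shape.
    """
    merged = {}
    for name, base in pretrained.items():
        # Task vector: what fine-tuning changed relative to the shared base.
        task_vectors = [model[name] - base for model in finetuned_models]
        # No training step: the merged weights are a closed-form combination.
        merged[name] = base + scale * np.sum(task_vectors, axis=0)
    return merged

# Toy example: one parameter tensor, two task-specific models.
pretrained = {"w": np.zeros(3)}
model_a = {"w": np.array([1.0, 0.0, 0.0])}
model_b = {"w": np.array([0.0, 2.0, 0.0])}

merged = merge_task_arithmetic(pretrained, [model_a, model_b], scale=0.5)
print(merged["w"])
```

In this sketch, `scale` plays the role of a merging parameter: the person performing the merge chooses it, so anyone contributing a model (benign or adversarial) cannot know its value in advance. That is one reason an attack might need to be robust to merging variations.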
Taxonomy
Topics: Access Control and Trust · Security and Verification in Computing · Formal Methods in Verification
