Towards Backdoor Stealthiness in Model Parameter Space
Xiaoyun Xu, Zhuoran Liu, Stefanos Koffas, Stjepan Picek

TL;DR
This paper shows that backdoor attacks designed to be stealthy in input and feature spaces can still be detected in parameter space, and introduces Grond, a supply-chain attack that remains stealthy and effective against a diverse set of defenses.
Contribution
It shows that current backdoor attacks are exposed in parameter space and proposes Grond, which uses Adversarial Backdoor Injection (ABI) to improve stealthiness and attack success against diverse defenses.
Findings
Backdoor attacks that are stealthy in input and feature spaces can be mitigated by examining the backdoored model's parameters.
Grond outperforms existing attacks against state-of-the-art defenses on multiple datasets.
ABI enhances the effectiveness of various backdoor attacks.
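To make "examining model parameters" concrete, here is a minimal sketch of one generic kind of parameter-space check: flag neurons whose weight norm is an outlier relative to the rest of the layer. This is an illustrative assumption, not the procedure used in the paper or by any specific defense it evaluates; the function name `flag_outlier_neurons`, the threshold factor `k`, and the toy layer are all hypothetical.

```python
import math

def flag_outlier_neurons(weights, k=2.0):
    """Return indices of neurons (rows) whose L2 weight norm exceeds
    mean + k * std over the layer. Hypothetical helper for illustration."""
    # One L2 norm per output neuron (one row of the weight matrix).
    norms = [math.sqrt(sum(w * w for w in row)) for row in weights]
    mean = sum(norms) / len(norms)
    # Population standard deviation of the norms.
    std = math.sqrt(sum((n - mean) ** 2 for n in norms) / len(norms))
    threshold = mean + k * std
    return [i for i, n in enumerate(norms) if n > threshold]

# Toy layer (assumed data): nine ordinary neurons and one with unusually large weights.
layer = [[0.1, -0.2, 0.1]] * 9 + [[3.0, 4.0, 0.0]]
print(flag_outlier_neurons(layer, k=2.0))  # prints [9]
```

A check like this only looks at the weights, so it is independent of how well the trigger is hidden in inputs or features. An attack that keeps its parameter changes small would, under this sketch's assumptions, leave fewer such outliers to find.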
Abstract
Recent research on backdoor stealthiness focuses mainly on indistinguishable triggers in input space and inseparable backdoor representations in feature space, aiming to circumvent backdoor defenses that examine these respective spaces. However, existing backdoor attacks are typically designed to resist a specific type of backdoor defense without considering the diverse range of defense mechanisms. Based on this observation, we pose a natural question: Are current backdoor attacks truly a real-world threat when facing diverse practical defenses? To answer this question, we examine 12 common backdoor attacks that focus on input-space or feature-space stealthiness and 17 diverse representative defenses. Surprisingly, we reveal a critical blind spot: Backdoor attacks designed to be stealthy in input and feature spaces can be mitigated by examining backdoored models in parameter space. To…
Taxonomy
Topics: Neural Networks and Applications
Methods: Focus
