Revisiting Backdoor Threat in Federated Instruction Tuning from a Signal Aggregation Perspective
Haodong Zhao, Jinming Hu, Gongshen Liu

TL;DR
Low-concentration poisoned data spread across the datasets of benign clients creates a significant backdoor vulnerability in federated instruction tuning, and existing defenses fail to detect it.
Contribution
The paper introduces a new perspective on backdoor threats in federated learning: it models backdoor implantation as signal aggregation and demonstrates that current defenses are ineffective against the resulting threat.
Findings
A poisoned-data share below 10% can yield an attack success rate above 85%.
State-of-the-art defenses are ineffective against this distributed backdoor threat.
The backdoor signal can be quantified using the Backdoor Signal-to-Noise Ratio.
Abstract
Federated learning security research has predominantly focused on backdoor threats from a minority of malicious clients that intentionally corrupt model updates. This paper challenges this paradigm by investigating a more pervasive and insidious threat: backdoor vulnerabilities from low-concentration poisoned data distributed across the datasets of benign clients. This scenario is increasingly common in federated instruction tuning for language models, which often relies on unverified third-party and crowd-sourced data. We analyze two forms of backdoor data through real cases: 1) natural triggers (inherent features acting as implicit triggers); 2) adversary-injected triggers. To analyze this threat, we model the backdoor implantation process from a signal aggregation perspective, proposing the Backdoor Signal-to-Noise Ratio to quantify the dynamics of the distributed backdoor signal.…
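To make the signal aggregation intuition concrete, here is a minimal toy sketch in Python. It is not the paper's Backdoor Signal-to-Noise Ratio, whose definition the truncated abstract does not give. It assumes a simplified model in which every client update carries a backdoor component proportional to its poison fraction plus independent zero-mean noise, so averaging over clients preserves the signal while shrinking the noise. The function names, `signal_strength`, and `noise_std` are hypothetical and chosen only for illustration.

```python
import math


def global_poison_fraction(client_sizes, client_poison_fractions):
    """Sample-weighted share of poisoned examples across all clients."""
    total = sum(client_sizes)
    poisoned = sum(n * p for n, p in zip(client_sizes, client_poison_fractions))
    return poisoned / total


def toy_aggregate_snr(poison_fraction, signal_strength, noise_std, num_clients):
    """Toy SNR of an averaged update (assumption, not the paper's metric).

    The mean backdoor signal is poison_fraction * signal_strength.
    Averaging independent noise over num_clients divides its standard
    deviation by sqrt(num_clients).
    """
    signal = poison_fraction * signal_strength
    averaged_noise_std = noise_std / math.sqrt(num_clients)
    return signal / averaged_noise_std


# Hypothetical setup: four benign clients, each with 5% poisoned data.
sizes = [1000, 1000, 1000, 1000]
fractions = [0.05, 0.05, 0.05, 0.05]
p = global_poison_fraction(sizes, fractions)

# Compare a single client's update with an average over 16 such clients.
snr_single = toy_aggregate_snr(p, signal_strength=1.0, noise_std=0.5, num_clients=1)
snr_many = toy_aggregate_snr(p, signal_strength=1.0, noise_std=0.5, num_clients=16)
print(p, snr_single, snr_many)
```

Under these assumptions the ratio grows with the square root of the number of participating clients, which illustrates why a poison level too low to stand out in any single client's data could still accumulate in the aggregated model.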
Taxonomy
Topics: Adversarial Robustness in Machine Learning · Privacy-Preserving Technologies in Data · Advanced Malware Detection Techniques
