FedSurrogate: Backdoor Defense in Federated Learning via Layer Criticality and Surrogate Replacement

Fatima Z. Abacha; Sin G. Teo; Yuanxiang Wu; Lucas C. Cordeiro; Mustafa A. Mustafa

arXiv:2605.11122·cs.CR·May 13, 2026

TL;DR

FedSurrogate is a novel federated learning defense that reduces false positives and neutralizes backdoor attacks by combining gradient filtering, layer-criticality analysis, and surrogate replacement.

Contribution

The paper introduces a backdoor defense that mitigates attacks while keeping false-positive rates low in non-IID federated learning scenarios.

Findings

01. Maintains false-positive rates below 10% across datasets and attacks.

02. Achieves attack success rates below 2.1% in non-IID settings.

03. Outperforms baseline methods in accuracy and security metrics.

Abstract

Federated Learning remains highly susceptible to backdoor attacks, in which malicious clients inject targeted behaviours into the global model. Existing defenses suffer from substantial false-positive rates under realistic non-independent and identically distributed (non-IID) data, incorrectly flagging benign clients and degrading model accuracy even when adversaries are correctly identified. We present FedSurrogate, a novel backdoor defense that addresses this limitation by combining bidirectional gradient alignment filtering with layer-adaptive anomaly detection. FedSurrogate performs selective clustering on security-critical layers identified via directional divergence analysis, concentrating the detection signal on a low-dimensional subspace. A bidirectional soft-filtering stage screens trusted clients for residual contamination while rescuing false positives from suspects, substantially…
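
To make the idea of concentrating detection on a few layers concrete, the following is a minimal illustrative sketch and not the authors' implementation. It assumes that "directional divergence" can be approximated by the mean pairwise cosine distance between client updates within a layer, and it swaps the paper's clustering step for a simple cosine threshold against the coordinate-wise median direction. The function names, the `k` parameter, and the `threshold` value are all hypothetical. The bidirectional soft-filtering and surrogate replacement stages are not modelled here.

```python
import numpy as np

def layer_divergence(updates):
    """Score each layer by mean pairwise cosine distance between client updates.

    updates: dict mapping layer name -> array of shape (n_clients, dim).
    This scoring rule is an assumed stand-in for directional divergence analysis.
    """
    scores = {}
    for name, U in updates.items():
        norms = np.linalg.norm(U, axis=1, keepdims=True)
        V = U / np.maximum(norms, 1e-12)          # unit-normalise each client row
        cos = V @ V.T                              # pairwise cosine similarities
        n = len(U)
        off_diag = cos[~np.eye(n, dtype=bool)]     # drop self-similarities
        scores[name] = float(1.0 - off_diag.mean())
    return scores

def select_critical_layers(scores, k=1):
    """Return the k layers with the highest divergence score (k is hypothetical)."""
    return sorted(scores, key=scores.get, reverse=True)[:k]

def flag_suspects(updates, layers, threshold=0.0):
    """Flag clients whose update on the selected layers points away from the
    coordinate-wise median direction (a simple substitute for clustering)."""
    X = np.concatenate([updates[name] for name in layers], axis=1)
    X = X / np.maximum(np.linalg.norm(X, axis=1, keepdims=True), 1e-12)
    median_dir = np.median(X, axis=0)
    median_dir = median_dir / max(np.linalg.norm(median_dir), 1e-12)
    similarity = X @ median_dir
    return np.where(similarity < threshold)[0]

# Synthetic example: 8 benign clients and 2 clients that flip the "fc" layer.
rng = np.random.default_rng(0)
benign, malicious, dim = 8, 2, 16
conv = np.ones((benign + malicious, dim)) + 0.01 * rng.normal(size=(benign + malicious, dim))
fc = np.ones((benign + malicious, dim)) + 0.01 * rng.normal(size=(benign + malicious, dim))
fc[benign:] *= -1.0                                # malicious updates oppose the benign direction
updates = {"conv": conv, "fc": fc}

scores = layer_divergence(updates)
critical = select_critical_layers(scores, k=1)
suspects = flag_suspects(updates, critical)
print(critical, suspects.tolist())
```

In this toy setup only the `fc` layer carries the adversarial signal, so restricting the similarity check to that layer keeps the comparison in a low-dimensional subspace rather than diluting it across all parameters.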
