Nosy Layers, Noisy Fixes: Tackling DRAs in Federated Learning Systems using Explainable AI
Meghali Nandi, Arash Shaghaghi, Nazatul Haque Sultan, Gustavo Batista, Raymond K. Zhao, Sanjay Jha

TL;DR
This paper introduces DRArmor, a novel defense mechanism for federated learning that uses Explainable AI to detect and mitigate malicious layers responsible for data reconstruction attacks, effectively protecting privacy while maintaining model accuracy.
Contribution
DRArmor uniquely identifies malicious layers within models using Explainable AI, applying targeted defenses to reduce data leakage in federated learning systems.
Findings
Detects malicious layers with a true positive rate (TPR) of 0.910 and a true negative rate (TNR) of 0.890.
Achieves 87% average model accuracy.
Reduces data leakage by 62.5% on datasets with 500 samples per client.
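For readers unfamiliar with the detection metrics, the sketch below shows how TPR and TNR are computed from confusion-matrix counts. It assumes that a "positive" is a layer flagged as malicious. The function name `detection_rates` and the counts are hypothetical, chosen only to illustrate the arithmetic; they are not taken from the paper.

```python
def detection_rates(tp, fn, tn, fp):
    """Return (TPR, TNR) from confusion-matrix counts."""
    tpr = tp / (tp + fn)  # share of malicious layers that were flagged
    tnr = tn / (tn + fp)  # share of benign layers that were left alone
    return tpr, tnr


# Hypothetical counts for illustration only; not data from the paper.
tpr, tnr = detection_rates(tp=91, fn=9, tn=89, fp=11)
print(f"TPR={tpr:.3f}, TNR={tnr:.3f}")  # prints "TPR=0.910, TNR=0.890"
```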
Abstract
Federated Learning (FL) has emerged as a powerful paradigm for collaborative model training while keeping client data decentralized and private. However, it is vulnerable to Data Reconstruction Attacks (DRA) such as "LoKI" and "Robbing the Fed", where malicious models sent from the server to the client can reconstruct sensitive user data. To counter this, we introduce DRArmor, a novel defense mechanism that integrates Explainable AI with targeted detection and mitigation strategies for DRA. Unlike existing defenses that focus on the entire model, DRArmor identifies and addresses the root cause (i.e., malicious layers within the model that send gradients with malicious intent) by analyzing their contribution to the output and detecting inconsistencies in gradient values. Once these malicious layers are identified, DRArmor applies defense techniques such as noise injection, pixelation,…
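The layer-targeted idea in the abstract can be illustrated with a minimal sketch: perturb only the gradients of layers that a detector has flagged, and leave the rest untouched. This is not the authors' implementation. The function `add_noise_to_flagged`, the layer names, the Gaussian noise model, and the `sigma` value are all assumptions made for illustration, and the detection step itself is taken as given.

```python
import random


def add_noise_to_flagged(gradients, flagged, sigma=0.1, seed=None):
    """Add Gaussian noise to the gradients of flagged layers only.

    gradients: dict mapping layer name -> list of gradient values
    flagged:   set of layer names judged malicious (detector assumed given)
    sigma:     noise standard deviation (hypothetical default)
    """
    rng = random.Random(seed)
    protected = {}
    for name, values in gradients.items():
        if name in flagged:
            # Perturb every gradient value of a suspicious layer.
            protected[name] = [v + rng.gauss(0.0, sigma) for v in values]
        else:
            # Benign layers pass through unchanged.
            protected[name] = list(values)
    return protected


# Hypothetical layer names and values, for illustration only.
grads = {"conv1": [0.2, -0.1], "fc_suspect": [5.0, -4.0]}
out = add_noise_to_flagged(grads, flagged={"fc_suspect"}, sigma=0.1, seed=0)
```

Restricting the perturbation to flagged layers is what distinguishes this from whole-model noise: the gradients of benign layers reach the server unmodified, which is the intuition behind preserving model accuracy.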
Taxonomy
Topics: Privacy-Preserving Technologies in Data · Adversarial Robustness in Machine Learning · Explainable Artificial Intelligence (XAI)
Methods: Focus · Pruning · Dynamic Range Activator
