Nosy Layers, Noisy Fixes: Tackling DRAs in Federated Learning Systems using Explainable AI

Meghali Nandi; Arash Shaghaghi; Nazatul Haque Sultan; Gustavo Batista; Raymond K. Zhao; Sanjay Jha

arXiv:2505.10942 · cs.CR · May 19, 2025

TL;DR

This paper introduces DRArmor, a novel defense mechanism for federated learning that uses Explainable AI to detect and mitigate malicious layers responsible for data reconstruction attacks, effectively protecting privacy while maintaining model accuracy.

Contribution

DRArmor uniquely identifies malicious layers within models using Explainable AI, applying targeted defenses to reduce data leakage in federated learning systems.

Findings

01 High detection rates: true positive rate (TPR) of 0.910 and true negative rate (TNR) of 0.890.

02 Achieves 87% average model accuracy.

03 Reduces data leakage by 62.5% on datasets with 500 samples per client.
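
For readers unfamiliar with the detection metrics quoted above, the following is a minimal sketch of how TPR and TNR are conventionally computed from binary decisions. It assumes the detector emits one malicious/benign decision per item (for example, per layer); the function name `detection_rates` and the sample labels are hypothetical and are not taken from the paper.

```python
def detection_rates(is_malicious, flagged):
    """Return (TPR, TNR) given ground-truth labels and detector decisions.

    is_malicious: list of bools, True where the item really is malicious.
    flagged:      list of bools, True where the detector raised a flag.
    """
    pairs = list(zip(is_malicious, flagged))
    tp = sum(1 for m, f in pairs if m and f)          # malicious, flagged
    fn = sum(1 for m, f in pairs if m and not f)      # malicious, missed
    tn = sum(1 for m, f in pairs if not m and not f)  # benign, left alone
    fp = sum(1 for m, f in pairs if not m and f)      # benign, flagged
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one malicious and one benign item")
    tpr = tp / (tp + fn)  # share of malicious items that were caught
    tnr = tn / (tn + fp)  # share of benign items that were not flagged
    return tpr, tnr


# Hypothetical example: 2 malicious and 3 benign items.
truth = [True, True, False, False, False]
decisions = [True, False, False, False, True]
tpr, tnr = detection_rates(truth, decisions)
print(f"TPR={tpr:.3f} TNR={tnr:.3f}")
```

A TPR of 0.910 therefore means roughly 91% of truly malicious items were flagged, and a TNR of 0.890 means roughly 89% of benign items were correctly left unflagged.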

Abstract

Federated Learning (FL) has emerged as a powerful paradigm for collaborative model training while keeping client data decentralized and private. However, it is vulnerable to Data Reconstruction Attacks (DRA) such as "LoKI" and "Robbing the Fed", where malicious models sent from the server to the client can reconstruct sensitive user data. To counter this, we introduce DRArmor, a novel defense mechanism that integrates Explainable AI with targeted detection and mitigation strategies for DRA. Unlike existing defenses that focus on the entire model, DRArmor identifies and addresses the root cause (i.e., malicious layers within the model that send gradients with malicious intent) by analyzing their contribution to the output and detecting inconsistencies in gradient values. Once these malicious layers are identified, DRArmor applies defense techniques such as noise injection, pixelation,…
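
The abstract names noise injection as one of the defenses applied to layers flagged as malicious. The sketch below illustrates only that general idea: Gaussian noise is added to the gradients of flagged layers, and all other layers pass through unchanged. It is not the authors' implementation. The function name `add_noise_to_flagged`, the dictionary layout of the gradients, the Gaussian distribution, and the `sigma` value are all assumptions made for illustration, and the detection step that would produce the set of flagged layers is not shown.

```python
import random


def add_noise_to_flagged(gradients, flagged_layers, sigma=0.1, seed=None):
    """Return a copy of `gradients` with noise added only to flagged layers.

    gradients:      dict mapping layer name -> list of float gradient values
                    (a hypothetical flat layout chosen for this sketch).
    flagged_layers: set of layer names assumed to be flagged as malicious.
    sigma:          standard deviation of the Gaussian noise (assumed value).
    """
    rng = random.Random(seed)
    noisy = {}
    for name, values in gradients.items():
        if name in flagged_layers:
            # Perturb every gradient entry of a flagged layer.
            noisy[name] = [v + rng.gauss(0.0, sigma) for v in values]
        else:
            # Leave unflagged layers untouched (copied, not aliased).
            noisy[name] = list(values)
    return noisy


# Hypothetical client update with two layers; only "fc1" is flagged.
client_grads = {"fc1": [0.10, -0.20, 0.05], "fc2": [0.30, 0.40]}
protected = add_noise_to_flagged(client_grads, {"fc1"}, sigma=0.5, seed=0)
print(protected["fc2"])  # unflagged layer is passed through as-is
```

Restricting the perturbation to flagged layers is what distinguishes a targeted defense from adding noise to the whole update: unflagged layers keep their exact gradients, which is the intuition behind limiting the accuracy cost.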

Taxonomy

Topics: Privacy-Preserving Technologies in Data · Adversarial Robustness in Machine Learning · Explainable Artificial Intelligence (XAI)

Methods: Focus · Pruning · Dynamic Range Activator