Data Poisoning and Leakage Analysis in Federated Learning

Wenqi Wei, Tiansheng Huang, Zachary Yahn, Anoop Singhal, Margaret Loper, and Ling Liu

arXiv:2409.13004 · cs.LG · September 23, 2024

TL;DR

This paper analyzes privacy and poisoning threats in federated learning, exploring attack mechanisms, defenses like gradient noise perturbation, and the effectiveness of mitigation strategies through empirical evidence.

Contribution

It provides a comprehensive analysis of data leakage and poisoning threats in federated learning, proposing dynamic model perturbation for simultaneous privacy and security enhancement.

Findings

01. Gradient noise addition can mitigate data leakage effectively.

02. Poisoning attacks significantly degrade global model performance.

03. Dynamic perturbation balances privacy, poisoning resilience, and model accuracy.

Abstract

Data poisoning and leakage risks impede the massive deployment of federated learning in the real world. This chapter reveals the truths and pitfalls of understanding two dominating threats: training data privacy intrusion and training data poisoning. We first investigate training data privacy threats and present our observations on when and how training data may be leaked during the course of federated training. One promising defense strategy is to perturb the raw gradient update by adding controlled randomized noise prior to sharing during each round of federated learning. We discuss the importance of determining the proper amount of randomized noise and the proper location to add such noise for effective mitigation of gradient leakage threats against training data privacy. We then review and compare different training data poisoning threats and analyze why and…
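To make the gradient perturbation defense mentioned in the abstract concrete, here is a minimal sketch of a client adding Gaussian noise to its gradient update before sharing it. It is not the chapter's actual method. The function names (perturb_gradient, federated_average), the clipping step, and the parameters clip_norm and noise_scale are all assumptions made for illustration. Clipping is included only because it is commonly paired with noise addition to bound the influence of any single update.

```python
import math
import random


def perturb_gradient(grad, clip_norm=1.0, noise_scale=0.1, rng=None):
    """Clip a flat gradient to L2 norm `clip_norm`, then add Gaussian noise.

    Hypothetical helper: the names and default values are illustrative
    assumptions, not settings taken from the paper.
    """
    rng = rng or random.Random()
    norm = math.sqrt(sum(g * g for g in grad))
    # Scale down only when the gradient exceeds the clipping bound.
    scale = min(1.0, clip_norm / norm) if norm > 0 else 1.0
    clipped = [g * scale for g in grad]
    # Noise standard deviation is tied to the clipping bound.
    sigma = noise_scale * clip_norm
    return [g + rng.gauss(0.0, sigma) for g in clipped]


def federated_average(updates):
    """Coordinate-wise mean of the client updates (plain averaging)."""
    n = len(updates)
    return [sum(column) / n for column in zip(*updates)]


# One simulated round: each client perturbs locally, the server averages.
rng = random.Random(0)
client_grads = [[3.0, 4.0], [0.3, -0.4], [-6.0, 8.0]]
shared = [perturb_gradient(g, rng=rng) for g in client_grads]
global_update = federated_average(shared)
```

In this sketch the noise is added on the client, before anything leaves the device, which is one possible answer to the "where to add noise" question the abstract raises. A larger noise_scale hides more of the raw gradient but also distorts the averaged update more, which is the "how much noise" side of the same trade-off.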
