Gradient-Leaks: Understanding and Controlling Deanonymization in Federated Learning

Tribhuvanesh Orekondy; Seong Joon Oh; Yang Zhang; Bernt Schiele; Mario Fritz

arXiv:1805.05838 · cs.CR · September 15, 2020 · 20 cites

TL;DR

This paper investigates how model updates in federated learning can reveal user identities, demonstrating deanonymization risks and proposing data-augmentation defenses that balance privacy and utility.

Contribution

It uncovers the encoding of user-specific information in model updates and introduces data-augmentation strategies to mitigate deanonymization risks.

Findings

01 Model updates encode subtle user-specific variations.

02 Adversaries can deanonymize devices with limited auxiliary data.

03 Data-augmentation strategies effectively reduce deanonymization risk.

Abstract

Federated Learning (FL) systems are gaining popularity as a way to train Machine Learning (ML) models on large-scale user data collected on personal devices (e.g., smartphones) without the raw data leaving the device. At the core of FL is a network of anonymous user devices sharing training information (model parameter updates) computed locally on personal data. However, the type and degree of user-specific information encoded in the model updates is poorly understood. In this paper, we identify that model updates encode subtle variations in how users capture and generate data. These variations provide a strong statistical signal, allowing an adversary to effectively deanonymize participating devices using a limited set of auxiliary data. We analyze the resulting deanonymization attacks on diverse tasks on real-world (anonymized) user-generated data across a range of closed-…
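The attack idea summarized in the abstract can be illustrated with a toy sketch. This is not the authors' method or code. It assumes a linear model, synthetic data in which each user's features cluster around a user-specific offset, and an adversary who matches an anonymous update to the most similar reference update by cosine similarity. All names (`local_update`, `sample_user_data`, `deanonymize`, `run_demo`) are hypothetical.

```python
import math
import random


def local_update(w, data):
    """Gradient of the mean squared error of a linear model on local data."""
    grad = [0.0] * len(w)
    for x, y in data:
        err = sum(wi * xi for wi, xi in zip(w, x)) - y
        for j, xj in enumerate(x):
            grad[j] += 2.0 * err * xj / len(data)
    return grad


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(p * q for p, q in zip(a, b))
    norm = math.sqrt(sum(p * p for p in a)) * math.sqrt(sum(q * q for q in b))
    return dot / norm if norm else 0.0


def sample_user_data(rng, offset, n=50):
    """Assumption: each user's features cluster around a user-specific offset."""
    data = []
    for _ in range(n):
        x = [o + rng.gauss(0.0, 0.1) for o in offset]
        data.append((x, sum(x)))  # every user solves the same task, y = sum(x)
    return data


def deanonymize(update, references):
    """Return the user id whose reference update is most similar to `update`."""
    return max(references, key=lambda uid: cosine(update, references[uid]))


def run_demo(seed=0):
    rng = random.Random(seed)
    w = [0.0, 0.0, 0.0]  # shared global model sent to every device
    # Hypothetical per-user data "styles" (not taken from the paper).
    offsets = {0: [1.0, 0.2, 0.2], 1: [0.2, 1.0, 0.2], 2: [0.2, 0.2, 1.0]}
    # Adversary: reference updates computed from auxiliary data for each user.
    references = {u: local_update(w, sample_user_data(rng, o))
                  for u, o in offsets.items()}
    # Devices: "anonymous" updates computed from fresh private data.
    return [deanonymize(local_update(w, sample_user_data(rng, o)), references)
            for o in offsets.values()]


if __name__ == "__main__":
    print(run_demo())  # guessed user id for each anonymous update
```

In this toy setting the update direction is dominated by the user-specific offset, so a small amount of auxiliary data per user is enough to link updates to users. Real model updates are high-dimensional and noisier, and the paper's actual attack models may differ from this nearest-reference rule.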

Taxonomy

Topics: Privacy-Preserving Technologies in Data · Internet Traffic Analysis and Secure E-voting · Adversarial Robustness in Machine Learning