DataLens: Scalable Privacy Preserving Training via Gradient Compression and Aggregation

Boxin Wang; Fan Wu; Yunhui Long; Luka Rimanic; Ce Zhang; Bo Li

arXiv:2103.11109·cs.LG·March 29, 2022

TL;DR

DataLens is a scalable, differentially private generative model that combines gradient compression and aggregation to improve the trade-off between privacy and utility when training deep neural networks.

Contribution

The paper proposes DATALENS, a framework built on TOPAGG, a privacy-preserving gradient compression and aggregation approach, and reports improved privacy and utility over existing methods.

Findings

01

DATALENS guarantees differential privacy for generated data.

02

TOPAGG achieves higher utility than state-of-the-art DP SGD methods.

03

Extensive experiments show superior performance on multiple datasets.

Abstract

The recent success of deep neural networks (DNNs) hinges on the availability of large-scale datasets; however, training on such datasets often poses privacy risks for sensitive training information. In this paper, we aim to explore the power of generative models and gradient sparsity, and propose a scalable privacy-preserving generative model, DATALENS. Compared with the standard PATE privacy-preserving framework, which allows teachers to vote on one-dimensional predictions, voting on high-dimensional gradient vectors is challenging in terms of privacy preservation. As dimension reduction techniques are required, we need to navigate a delicate tradeoff space between (1) the improvement of privacy preservation and (2) the slowdown of SGD convergence. To tackle this, we take advantage of communication-efficient learning and propose a novel noise compression and aggregation approach, TOPAGG, by…
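The idea of teachers voting on compressed gradient vectors can be illustrated with a minimal sketch. This is a generic illustration, not the paper's TOPAGG: the abstract is truncated here, so the top-k sign compression, the Gaussian noise, the vote threshold, and the function names `topk_sign` and `noisy_vote` are all assumptions of this sketch. It also does no privacy accounting.

```python
import random

def topk_sign(grad, k):
    """Keep the k largest-magnitude coordinates and replace them by their sign.

    Assumption of this sketch: every teacher compresses its gradient this way,
    so each coordinate contributes a vote in {-1, 0, +1}.
    """
    top = sorted(range(len(grad)), key=lambda i: abs(grad[i]), reverse=True)[:k]
    out = [0] * len(grad)
    for i in top:
        out[i] = (grad[i] > 0) - (grad[i] < 0)  # sign as -1, 0, or +1
    return out

def noisy_vote(teacher_grads, k, sigma, threshold, rng):
    """Sum the teachers' compressed votes, add Gaussian noise, then threshold.

    sigma and threshold are hypothetical parameters; choosing them to meet a
    given privacy budget is outside the scope of this sketch.
    """
    dim = len(teacher_grads[0])
    votes = [0.0] * dim
    for grad in teacher_grads:
        for i, s in enumerate(topk_sign(grad, k)):
            votes[i] += s
    noisy = [v + rng.gauss(0.0, sigma) for v in votes]
    # A coordinate survives only if enough teachers agree on its direction.
    return [1 if v >= threshold else -1 if v <= -threshold else 0 for v in noisy]

# Toy example with three hypothetical teacher gradients of dimension 4.
teachers = [
    [3.0, -0.1, 0.2, -5.0],
    [2.0, 0.3, -0.1, -4.0],
    [1.5, -2.0, 0.0, -0.2],
]
aggregated = noisy_vote(teachers, k=2, sigma=1.0, threshold=2, rng=random.Random(0))
```

Compression bounds how much any single teacher can change the vote totals (at most k coordinates, each by at most 1), which is what makes adding calibrated noise feasible in high dimensions. The cost is that the aggregated direction is coarser, which relates to the convergence slowdown the abstract mentions.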

Taxonomy

Methods: Stochastic Gradient Descent