Distributionally-Robust Machine Learning Using Locally Differentially-Private Data

Farhad Farokhi

arXiv:2006.13488·cs.LG·June 25, 2020

TL;DR

This paper introduces a distributionally-robust approach to machine learning with locally-differentially private data, using Wasserstein distance to define ambiguity sets and deriving new regularizers for regression models, including exact solutions for Gaussian data.

Contribution

The paper formulates machine learning on locally differentially-private data as a distributionally-robust optimization problem and derives new regularizers for regression models, including an exact solution for Gaussian data.
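As a rough sketch of this formulation, in notation assumed here rather than taken from the paper: let \hat{P}_n be the empirical distribution of the n noisy samples \tilde{\xi}_i, W_1 the type-1 Wasserstein distance, \rho the radius of the ambiguity set, and \ell(\theta; \xi) the loss of a model with parameters \theta. Under the assumption that the type-1 distance is the one in use, the standard Lipschitz relaxation of the worst-case expected loss reads:

```latex
\[
\min_{\theta}\; \sup_{Q \,:\, W_1(Q, \hat{P}_n) \le \rho}\;
  \mathbb{E}_{\xi \sim Q}\big[\ell(\theta; \xi)\big]
\;\le\;
\min_{\theta}\; \left\{ \frac{1}{n} \sum_{i=1}^{n} \ell(\theta; \tilde{\xi}_i)
  + \rho \,\mathrm{Lip}\big(\ell(\theta; \cdot)\big) \right\}
\]
```

The inequality follows from Kantorovich-Rubinstein duality, which bounds the gap between two expectations of a Lipschitz function by its Lipschitz constant times the W_1 distance between the distributions. If the ambiguity set contains the distribution of the clean data, the left-hand side is also an upper bound on the expected loss on clean data.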

Findings

01

New regularizer for linear regression models.

02

Exact solution for Gaussian data case.

03

Demonstrated improved performance on practical datasets.

Abstract

We consider machine learning, particularly regression, using locally-differentially private datasets. The Wasserstein distance is used to define an ambiguity set centered at the empirical distribution of the dataset corrupted by local differential privacy noise. The ambiguity set is shown to contain the probability distribution of unperturbed, clean data. The radius of the ambiguity set is a function of the privacy budget, spread of the data, and the size of the problem. Hence, machine learning with locally-differentially private datasets can be rewritten as a distributionally-robust optimization. For general distributions, the distributionally-robust optimization problem can be relaxed as a regularized machine learning problem with the Lipschitz constant of the machine learning model as a regularizer. For linear and logistic regression, this regularizer is the dual norm of the model…
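The pipeline the abstract describes (perturb each record locally, then fit a regression with a dual-norm penalty) can be illustrated with a minimal sketch. Everything specific in it is an assumption made for illustration, not the paper's method: the Laplace mechanism as the local-DP noise, the squared loss, the penalty applied to the parameter vector `theta`, the plain subgradient solver, and the value of `rho`, which is a hypothetical placeholder rather than the radius formula derived in the paper. The helper names `privatize`, `dual_norm`, `objective`, and `fit` are likewise invented here.

```python
import numpy as np


def privatize(records, epsilon, bound, rng):
    """Per-record Laplace mechanism (an assumed local-DP mechanism).

    Assumes every entry of `records` lies in [-bound, bound], so the L1
    sensitivity of one d-dimensional record is 2 * bound * d.
    """
    d = records.shape[1]
    scale = 2.0 * bound * d / epsilon
    return records + rng.laplace(0.0, scale, size=records.shape)


def dual_norm(theta, p=2.0):
    """Dual of the l_p norm, i.e. the l_q norm with 1/p + 1/q = 1."""
    if p == 1.0:
        return np.max(np.abs(theta))
    if np.isinf(p):
        return np.sum(np.abs(theta))
    q = p / (p - 1.0)
    return np.sum(np.abs(theta) ** q) ** (1.0 / q)


def objective(theta, X, y, rho, p=2.0):
    """Mean squared error plus rho times the dual norm of theta."""
    return np.mean((X @ theta - y) ** 2) + rho * dual_norm(theta, p)


def fit(X, y, rho, steps=2000):
    """Subgradient descent for the p = 2 case (the l_2 norm is self-dual)."""
    n = len(y)
    # Step size from the smoothness constant of the squared-loss term.
    lr = n / (2.0 * np.linalg.norm(X, 2) ** 2 + 1e-12)
    theta = np.zeros(X.shape[1])
    best, best_val = theta.copy(), objective(theta, X, y, rho)
    for _ in range(steps):
        grad = 2.0 * X.T @ (X @ theta - y) / n
        norm = np.linalg.norm(theta)
        if norm > 0:  # at theta = 0, the zero vector is a valid subgradient
            grad = grad + rho * theta / norm
        theta = theta - lr * grad
        val = objective(theta, X, y, rho)
        if val < best_val:  # keep the best iterate seen so far
            best, best_val = theta.copy(), val
    return best


rng = np.random.default_rng(0)
n, d = 500, 3
X_clean = rng.uniform(-1.0, 1.0, size=(n, d))
theta_true = np.array([0.5, -0.3, 0.2])
y_clean = X_clean @ theta_true  # stays in [-1, 1] for these weights

# Each individual perturbs their own record (features and label) locally.
noisy = privatize(np.column_stack([X_clean, y_clean]), 5.0, 1.0, rng)
X_noisy, y_noisy = noisy[:, :d], noisy[:, d]

theta_hat = fit(X_noisy, y_noisy, rho=0.5)  # rho is a hypothetical value
print(theta_hat)
```

In this sketch a larger `rho` stands in for a larger ambiguity set, which would correspond to a tighter privacy budget or more spread-out data; choosing it in practice would require the radius expression from the paper itself.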

Taxonomy

Methods: Linear Regression