On the Importance of Architecture and Feature Selection in Differentially Private Machine Learning

Wenxuan Bao, Luke A. Bauer, and Vincent Bindschaedler

arXiv:2205.06720 · cs.CR · May 16, 2022 · 1 citation

TL;DR

This paper highlights that naive application of differential privacy in machine learning can lead to overly complex, poorly performing models, and proposes methods to incorporate privacy considerations into feature and architecture selection.

Contribution

The paper provides a theoretical framework and empirical evidence for how DP noise affects model complexity and performance, and proposes privacy-aware feature and architecture selection algorithms.

Findings

1. Naive DP application results in overly complex models.

2. Accounting for DP noise allows for simpler, more accurate models.

3. Proposed algorithms improve privacy-utility trade-offs.

Abstract

We study a pitfall in the typical workflow for differentially private machine learning. The use of differentially private learning algorithms in a "drop-in" fashion -- without accounting for the impact of differential privacy (DP) noise when choosing what feature engineering operations to use, what features to select, or what neural network architecture to use -- yields overly complex and poorly performing models. In other words, by anticipating the impact of DP noise, a simpler and more accurate alternative model could have been trained for the same privacy guarantee. We systematically study this phenomenon through theory and experiments. On the theory front, we provide an explanatory framework and prove that the phenomenon arises naturally from the addition of noise to satisfy differential privacy. On the experimental front, we demonstrate how the phenomenon manifests in practice…
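One way to build intuition for why anticipating DP noise can favor simpler models is to look at how noise scales with the number of trainable parameters. The sketch below is not the paper's framework; it assumes a DP-SGD-style Gaussian mechanism in which per-example gradients are clipped to a fixed norm and isotropic Gaussian noise is added to their sum. The function name `noise_to_signal_ratio` and all parameter values are hypothetical and chosen only for illustration.

```python
import numpy as np


def noise_to_signal_ratio(num_params, clip_norm=1.0, noise_multiplier=1.0,
                          batch_size=256, trials=200, seed=0):
    """Estimate the ratio of DP noise norm to the largest possible signal norm.

    Assumption (not from the paper): a DP-SGD-style update where each
    per-example gradient is clipped to `clip_norm`, Gaussian noise with
    standard deviation `noise_multiplier * clip_norm` is added to the sum,
    and the result is divided by `batch_size`.
    """
    rng = np.random.default_rng(seed)
    # Noise added to the summed gradient: one independent draw per coordinate.
    noise = rng.normal(0.0, noise_multiplier * clip_norm,
                       size=(trials, num_params))
    # Average L2 norm of the noise after dividing by the batch size.
    noise_norm = np.linalg.norm(noise, axis=1).mean() / batch_size
    # The averaged clipped gradient can never have norm above clip_norm.
    signal_bound = clip_norm
    return noise_norm / signal_bound


if __name__ == "__main__":
    for d in (100, 10_000, 1_000_000 // 100):
        print(d, noise_to_signal_ratio(d))
```

Under these assumptions the noise norm grows roughly with the square root of the parameter count while the clipped signal stays bounded, so a model with 100 times more parameters sees about 10 times more noise relative to its gradient. This is consistent with the abstract's claim that a simpler model can be more accurate at the same privacy guarantee, although the paper's own analysis may rest on a different argument.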

Taxonomy

Topics: Privacy-Preserving Technologies in Data · Adversarial Robustness in Machine Learning · Stochastic Gradient Optimization Techniques

Methods: Feature Selection