Hermite Polynomial Features for Private Data Generation

Margarita Vinaroz, Mohammad-Amin Charusaie, Frederik Harder, Kamil Adamczewski, Mijung Park

arXiv:2106.05042 · cs.LG · June 24, 2022 · 1 citation

TL;DR

This paper introduces Hermite polynomial features as a more efficient alternative to random features for differentially private data generation, improving the privacy-accuracy trade-off by requiring fewer features.

Contribution

The paper proposes replacing random features with Hermite polynomial features to better approximate kernel mean embeddings for private data generation.
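For context, the random-feature baseline that this contribution replaces can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names, the frequency matrix `W`, and the use of the classical Gaussian mechanism (valid for 0 < epsilon < 1) are all assumptions. It shows why a finite-dimensional, norm-bounded feature map makes the sensitivity of the mean embedding easy to compute.

```python
import numpy as np

def random_fourier_features(X, W):
    """Map X of shape (m, d) to features of shape (m, 2D) with unit L2 norm.

    W has shape (d, D) and holds hypothetical random frequencies.
    """
    proj = X @ W
    # cos^2 + sin^2 = 1 per frequency, so dividing by sqrt(D) gives norm 1.
    return np.hstack([np.cos(proj), np.sin(proj)]) / np.sqrt(W.shape[1])

def private_mean_embedding(X, W, epsilon, delta, rng):
    """Release a noisy mean embedding (assumed classical Gaussian mechanism)."""
    m = X.shape[0]
    mean_emb = random_fourier_features(X, W).mean(axis=0)
    # Replacing one record moves the mean by at most 2/m in L2 norm,
    # because every feature vector has norm 1.
    sensitivity = 2.0 / m
    sigma = sensitivity * np.sqrt(2.0 * np.log(1.25 / delta)) / epsilon
    noisy = mean_emb + rng.normal(0.0, sigma, size=mean_emb.shape)
    return noisy, sigma

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))   # toy data, not from the paper
W = rng.normal(size=(3, 50))    # 50 frequencies give 100 features
noisy_emb, sigma = private_mean_embedding(X, W, epsilon=0.5, delta=1e-5, rng=rng)
print(noisy_emb.shape, sigma)
```

In this sketch the same noise scale is applied to every coordinate, so the total noise added grows with the number of features. That is one way to read the motivation for using fewer features.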

Findings

01. Hermite polynomial features outperform random Fourier features in privacy-accuracy trade-offs.

02. Fewer Hermite features are needed to achieve similar or better approximation accuracy.

03. Demonstrated effectiveness on tabular and image datasets.

Abstract

Kernel mean embedding is a useful tool for representing and comparing probability measures. Despite its usefulness, it relies on infinite-dimensional features, which are challenging to handle in the context of differentially private data generation. Recent work proposes approximating the kernel mean embedding of the data distribution using finite-dimensional random features, which yields an analytically tractable sensitivity. However, the number of random features required is excessively high, often ten thousand to a hundred thousand, which worsens the privacy-accuracy trade-off. To improve the trade-off, we propose to replace random features with Hermite polynomial features. Unlike random features, Hermite polynomial features are ordered: the low-order features contain more information about the distribution than the high-order ones. Hence, a relatively…
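The idea of ordered features can be illustrated with a one-dimensional sketch. It assumes the Hermite features come from Mehler's formula for the Gaussian kernel; the excerpt above does not say which construction the authors use, and the function names, the parameter `rho`, and the truncation orders below are illustrative choices. Under this assumption the kernel is k(x, y) = exp(-rho (x - y)^2 / (1 - rho^2)), and the feature of order n carries a weight proportional to rho^(n/2), so high orders contribute less.

```python
import numpy as np

def hermite_features(x, order, rho):
    """Truncated Mehler-expansion features for 1-D inputs; shape (n, order + 1)."""
    x = np.asarray(x, dtype=float).ravel()
    # h[n] = H_n(x) / sqrt(2^n n!) with physicists' Hermite polynomials,
    # built by a normalized three-term recurrence to avoid overflow.
    h = np.empty((order + 1, x.size))
    h[0] = 1.0
    if order >= 1:
        h[1] = np.sqrt(2.0) * x
    for n in range(1, order):
        h[n + 1] = np.sqrt(2.0 / (n + 1)) * x * h[n] - np.sqrt(n / (n + 1)) * h[n - 1]
    # Geometrically decaying weights: low orders dominate the expansion.
    scale = (1.0 - rho**2) ** 0.25 * rho ** (np.arange(order + 1) / 2.0)
    envelope = np.exp(-rho * x**2 / (1.0 + rho))
    return (scale[:, None] * h * envelope[None, :]).T

def gaussian_kernel(x, y, rho):
    """Exact kernel that the features above approximate (assumed form)."""
    x = np.asarray(x, dtype=float).ravel()
    y = np.asarray(y, dtype=float).ravel()
    return np.exp(-rho * (x[:, None] - y[None, :]) ** 2 / (1.0 - rho**2))

x = np.linspace(-1.0, 1.0, 5)
exact = gaussian_kernel(x, x, rho=0.5)
for order in (2, 5, 20):
    phi = hermite_features(x, order, rho=0.5)
    # Maximum absolute error of the truncated feature inner products.
    print(order, np.max(np.abs(phi @ phi.T - exact)))
```

Because every term of the expansion is non-negative on the diagonal and the full sum equals k(x, x) = 1, the truncated feature vector has squared norm at most 1. Under the stated assumptions, this keeps the sensitivity of the mean embedding bounded.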

Code & Models

Repositories

parklabml/dp-hp (PyTorch, official)

Taxonomy

Topics: Privacy-Preserving Technologies in Data · Stochastic Gradient Optimization Techniques · Adversarial Robustness in Machine Learning