Hermite Polynomial Features for Private Data Generation
Margarita Vinaroz, Mohammad-Amin Charusaie, Frederik Harder, Kamil Adamczewski, Mijung Park

TL;DR
This paper introduces Hermite polynomial features as a more efficient alternative to random features for differentially private data generation, improving the privacy-accuracy trade-off by requiring fewer features.
Contribution
The paper proposes replacing random features with Hermite polynomial features to better approximate kernel mean embeddings for private data generation.
Findings
Hermite polynomial features outperform random Fourier features in privacy-accuracy trade-offs.
Fewer Hermite features are needed to achieve similar or better approximation accuracy.
The approach is shown to be effective on both tabular and image datasets.
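To make the ordering of Hermite features concrete, here is a minimal one-dimensional sketch. It relies on Mehler's formula, which expands the Gaussian kernel exp(-rho (x - y)^2 / (1 - rho^2)) into a sum over physicists' Hermite polynomials with geometrically decaying weights rho^n. The function names, the parameter rho = 0.3, and the test grid are illustrative assumptions; the paper's own parameterization and its multi-dimensional construction may differ.

```python
import numpy as np

def hermite_features(x, rho, order):
    """Hermite features of orders 0..order for 1-D inputs (illustrative sketch).

    Returns an array of shape (len(x), order + 1) whose inner products
    approximate exp(-rho * (x - y)**2 / (1 - rho**2)) via Mehler's formula.
    """
    x = np.asarray(x, dtype=float).ravel()
    # h[n] holds the normalized polynomial H_n(x) / sqrt(2**n * n!),
    # built with a stable three-term recurrence.
    h = np.empty((order + 1, x.size))
    h[0] = 1.0
    if order >= 1:
        h[1] = np.sqrt(2.0) * x
    for n in range(1, order):
        h[n + 1] = np.sqrt(2.0 / (n + 1)) * x * h[n] - np.sqrt(n / (n + 1)) * h[n - 1]
    # Weights decay like rho**(n/2): low orders carry most of the kernel.
    weights = (1.0 - rho**2) ** 0.25 * rho ** (0.5 * np.arange(order + 1))
    envelope = np.exp(-rho * x**2 / (1.0 + rho))
    return (weights[:, None] * h * envelope[None, :]).T

def gaussian_kernel(x, y, rho):
    """Exact kernel that the truncated Hermite expansion approximates."""
    diff = x[:, None] - y[None, :]
    return np.exp(-rho / (1.0 - rho**2) * diff**2)

rho = 0.3                       # assumed kernel parameter, 0 < rho < 1
x = np.linspace(-1.0, 1.0, 5)   # assumed test grid
exact = gaussian_kernel(x, x, rho)

errors = {}
for order in (2, 5, 20):
    phi = hermite_features(x, rho, order)
    errors[order] = np.max(np.abs(phi @ phi.T - exact))
    print(f"order {order:2d}: max abs kernel error = {errors[order]:.2e}")
```

Averaging the rows of the feature matrix over a dataset gives a finite-dimensional approximation of its kernel mean embedding. Because the weights shrink geometrically with the order, truncating at a small order discards little of the kernel in this toy setting.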
Abstract
Kernel mean embedding is a useful tool to represent and compare probability measures. Despite its usefulness, kernel mean embedding considers infinite-dimensional features, which are challenging to handle in the context of differentially private data generation. A recent work proposes to approximate the kernel mean embedding of data distribution using finite-dimensional random features, which yields analytically tractable sensitivity. However, the number of required random features is excessively high, often ten thousand to a hundred thousand, which worsens the privacy-accuracy trade-off. To improve the trade-off, we propose to replace random features with Hermite polynomial features. Unlike the random features, the Hermite polynomial features are ordered, where the features at the low orders contain more information on the distribution than those at the high orders. Hence, a relatively…
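The sensitivity argument mentioned in the abstract can be illustrated with the random-feature baseline. The sketch below is an assumption-laden toy, not the authors' implementation: it uses unit-norm random Fourier features for a Gaussian kernel with lengthscale 1, assumes neighbouring datasets differ by replacing one record (so the L2 sensitivity of the mean embedding is at most 2/m), and applies the classical Gaussian mechanism, whose noise calibration holds for epsilon below 1. All function names and parameter values are hypothetical.

```python
import numpy as np

def fourier_features(x, omega):
    """Unit-norm random Fourier features for 1-D inputs x, given frequencies omega."""
    proj = np.asarray(x, dtype=float).reshape(-1, 1) * omega.reshape(1, -1)
    # Stacking cos and sin gives ||phi(x)||_2 = 1 for every input x.
    return np.hstack([np.cos(proj), np.sin(proj)]) / np.sqrt(omega.size)

def private_mean_embedding(feats, epsilon, delta, rng):
    """Release the mean of the feature rows with the classical Gaussian mechanism.

    Assumes every row has L2 norm at most 1 and replace-one neighbouring
    datasets, so the L2 sensitivity of the mean is at most 2 / m.
    """
    m = feats.shape[0]
    sensitivity = 2.0 / m
    sigma = sensitivity * np.sqrt(2.0 * np.log(1.25 / delta)) / epsilon
    noise = rng.normal(0.0, sigma, size=feats.shape[1])
    return feats.mean(axis=0) + noise, sigma

rng = np.random.default_rng(0)
data = rng.normal(size=1000)              # toy 1-D dataset (assumed)
omega = rng.normal(0.0, 1.0, size=500)    # frequencies for lengthscale 1 (assumed)
feats = fourier_features(data, omega)     # shape (1000, 1000)
noisy_embedding, sigma = private_mean_embedding(feats, epsilon=0.5, delta=1e-5, rng=rng)
print("per-coordinate noise std:", sigma)
```

Each coordinate receives noise with the same standard deviation, so the expected squared norm of the added noise grows linearly with the number of features. That is one way to read the abstract's point that needing tens of thousands of random features worsens the privacy-accuracy trade-off.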
Taxonomy
Topics: Privacy-Preserving Technologies in Data · Stochastic Gradient Optimization Techniques · Adversarial Robustness in Machine Learning
