# Pointwise adaptive kernel density estimation under local approximate differential privacy

**Authors:** Martin Kroll

arXiv: 1907.06233 · 2019-07-16

## TL;DR

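This paper develops a pointwise adaptive kernel density estimator under local approximate differential privacy, where anonymization happens on each data owner's side. For the mean squared error at a fixed point over Sobolev classes, the adaptive estimator attains the optimal rate of convergence up to extra logarithmic factors.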

## Contribution

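The paper privatizes kernel density estimators by adding appropriately scaled Laplace noise and Gaussian processes, proves minimax-type results for the pointwise mean squared error over Sobolev classes with smoothness $s>1/2$, and selects the bandwidth directly on the anonymized data with a variant of Lepski's method, for which it derives general oracle inequalities.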

## Key findings

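- Optimal convergence rate under local differential privacy: $n^{-(2s-1)/(2s+1)}$ for the mean squared error at a fixed point, compared with $n^{-(2s-1)/(2s)}$ without privacy restrictions
- With a correctly specified bandwidth, the average of private kernel density estimators from $n$ data owners attains the optimal rate
- The adaptive estimator, based on a variant of Lepski's method, attains the optimal rate at least up to extra logarithmic factors
- The anonymized data are meant to stay compatible with as many statistical procedures as possible, since anonymization precedes any data mining task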
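To make the gap between the two rates concrete, the following minimal sketch evaluates both exponents for a given smoothness $s$. The function name `rate_exponents` is a hypothetical helper introduced here for illustration; only the two formulas come from the abstract.

```python
def rate_exponents(s: float) -> tuple[float, float]:
    """Return (private, non_private) exponents e such that the rate is n**(-e).

    Formulas taken from the abstract:
      private (local DP):  (2s - 1) / (2s + 1)
      without privacy:     (2s - 1) / (2s)
    """
    if s <= 0.5:
        # The Sobolev classes in the abstract are indexed by s > 1/2.
        raise ValueError("smoothness s must be greater than 1/2")
    private = (2 * s - 1) / (2 * s + 1)
    non_private = (2 * s - 1) / (2 * s)
    return private, non_private


if __name__ == "__main__":
    for s in (1.0, 1.5, 3.0):
        private, non_private = rate_exponents(s)
        print(f"s={s}: private exponent {private:.3f}, non-private {non_private:.3f}")
```

For example, at $s = 1.5$ the exponents are $1/2$ under local privacy and $2/3$ without it, so the private exponent is always the smaller of the two.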

## Abstract

We consider non-parametric density estimation in the framework of local approximate differential privacy. In contrast to centralized privacy scenarios with a trusted curator, in the local setup anonymization must be guaranteed already on the individual data owners' side and therefore must precede any data mining tasks. Thus, the published anonymized data should be compatible with as many statistical procedures as possible. We suggest adding Laplace noise and Gaussian processes (both appropriately scaled) to kernel density estimators to obtain approximately differentially private versions of these estimators. We obtain minimax-type results over Sobolev classes indexed by a smoothness parameter $s>1/2$ for the mean squared error at a fixed point. In particular, we show that taking the average of private kernel density estimators from $n$ different data owners attains the optimal rate of convergence if the bandwidth parameter is correctly specified. Notably, the optimal convergence rate in terms of the sample size $n$ is $n^{-(2s-1)/(2s+1)}$ under local differential privacy and is thus slower than the rate $n^{-(2s-1)/(2s)}$, which holds without privacy restrictions. Since the optimal choice of the bandwidth parameter depends on the smoothness $s$ and is thus not accessible in practice, adaptive methods for bandwidth selection are necessary and must, in the local privacy framework, be performed directly on the anonymized data. We address this problem by means of a variant of Lepski's method tailored to the privacy setup and obtain general oracle inequalities for private kernel density estimators. In the Sobolev case, the resulting adaptive estimator attains the optimal rate of convergence at least up to extra logarithmic factors.
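The idea of averaging privatized kernel contributions can be illustrated with a simplified sketch. It is not the paper's construction: it assumes a single evaluation point `t`, a single fixed bandwidth `h`, an Epanechnikov kernel, and the plain Laplace mechanism with privacy parameter `alpha` (pure differential privacy for one released number, which in particular satisfies the approximate notion). The Gaussian-process component and the bandwidth selection step are left out, and the names `privatize_contributions` and `private_kde_at_point` are hypothetical.

```python
import numpy as np

K_MAX = 0.75  # maximum of the Epanechnikov kernel; its minimum is 0


def epanechnikov(u):
    """Epanechnikov kernel K(u) = 0.75 * (1 - u^2) on [-1, 1], zero elsewhere."""
    u = np.asarray(u, dtype=float)
    return np.where(np.abs(u) <= 1.0, 0.75 * (1.0 - u**2), 0.0)


def privatize_contributions(samples, t, h, alpha, rng):
    """Each data owner i releases K_h(t - X_i) plus Laplace noise.

    K_h(t - x) = K((t - x) / h) / h lies in [0, K_MAX / h], so two different
    inputs change the released value by at most K_MAX / h (the sensitivity).
    Laplace noise with scale sensitivity / alpha is the standard Laplace
    mechanism for one released number (assumption: one point, one bandwidth).
    """
    samples = np.asarray(samples, dtype=float)
    values = epanechnikov((t - samples) / h) / h
    scale = K_MAX / (h * alpha)
    noise = rng.laplace(loc=0.0, scale=scale, size=samples.shape)
    return values + noise


def private_kde_at_point(samples, t, h, alpha, rng):
    """Average the n privatized contributions to estimate the density at t."""
    return float(np.mean(privatize_contributions(samples, t, h, alpha, rng)))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    data = rng.normal(size=200_000)  # synthetic data, for illustration only
    estimate = private_kde_at_point(data, t=0.0, h=0.3, alpha=1.0, rng=rng)
    print(f"private estimate of the density at 0: {estimate:.4f}")
```

The noise scale grows like $1/h$, so shrinking the bandwidth inflates the privacy noise as well as the usual sampling variance. This is one way to see why the bias-variance trade-off, and hence the attainable rate, differs from the non-private case.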

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/1907.06233/full.md

## Figures

1 figure with captions in the complete paper: https://tomesphere.com/paper/1907.06233/full.md

## References

24 references — full list in the complete paper: https://tomesphere.com/paper/1907.06233/full.md

---
Source: https://tomesphere.com/paper/1907.06233