Learning from End User Data with Shuffled Differential Privacy over Kernel Densities

Tal Wagner

arXiv:2502.14087·cs.LG·February 21, 2025

TL;DR

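This paper introduces a shuffled differential privacy (DP) protocol for estimating the kernel density function of a dataset distributed across end users, with accuracy essentially matching central DP. The protocol is used to learn a private classifier, one density function per class, and the learned densities can recover the semantic content of their classes.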

Contribution

It presents a novel shuffled DP method for kernel density estimation that rivals central DP accuracy and supports private classification and semantic content recovery.

Findings

01 Achieves kernel density estimation accuracy comparable to central DP

02 Enables private classification using learned density functions

03 Demonstrates practical effectiveness through experiments

Abstract

We study a setting of collecting and learning from private data distributed across end users. In the shuffled model of differential privacy, the end users partially protect their data locally before sharing it, and their data is also anonymized during its collection to enhance privacy. This model has recently become a prominent alternative to central DP, which requires full trust in a central data curator, and local DP, where fully local data protection takes a steep toll on downstream accuracy. Our main technical result is a shuffled DP protocol for privately estimating the kernel density function of a distributed dataset, with accuracy essentially matching central DP. We use it to privately learn a classifier from the end user data, by learning a private density function per class. Moreover, we show that the density function itself can recover the semantic content of its class,…
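
To make the "one density function per class" idea concrete, here is a minimal non-private sketch. It is not the paper's shuffled DP protocol: it computes exact kernel density estimates with a Gaussian kernel and assigns a query to the class with the highest density. The function names, the bandwidth parameter, and the toy data are all assumptions made for illustration. In a private pipeline, `kde` would presumably be replaced by the privately estimated density for each class.

```python
import math
import random


def gaussian_kernel(x, y, bandwidth=1.0):
    """Gaussian kernel exp(-||x - y||^2 / (2 * bandwidth^2)) between two points."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / (2.0 * bandwidth ** 2))


def kde(points, query, bandwidth=1.0):
    """Kernel density at `query`: the average kernel value over the dataset."""
    return sum(gaussian_kernel(p, query, bandwidth) for p in points) / len(points)


def classify(class_points, query, bandwidth=1.0):
    """Return the label whose density estimate at `query` is largest.

    Assumption: exact (non-private) densities are used here; a private
    variant would plug in a privately estimated density per class.
    """
    scores = {label: kde(pts, query, bandwidth) for label, pts in class_points.items()}
    return max(scores, key=scores.get)


# Hypothetical toy data: two well-separated 2-D clusters, one per class.
rng = random.Random(0)
class_points = {
    "a": [(rng.gauss(0.0, 0.5), rng.gauss(0.0, 0.5)) for _ in range(200)],
    "b": [(rng.gauss(4.0, 0.5), rng.gauss(4.0, 0.5)) for _ in range(200)],
}

pred_a = classify(class_points, (0.2, -0.1))  # query near the first cluster
pred_b = classify(class_points, (3.8, 4.1))   # query near the second cluster
```

Because the classifier only ever evaluates the per-class density functions, any estimator with the same interface can be swapped in without touching `classify`.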

Videos

Learning from End User Data with Shuffled Differential Privacy over Kernel Densities· slideslive

Taxonomy

Topics: Privacy-Preserving Technologies in Data · Recommender Systems and Techniques · Stochastic Gradient Optimization Techniques