Local Differentially Private Frequency Estimation based on Learned Sketches

Meifan Zhang; Sixin Lin; Lihua Yin

arXiv:2211.01138·cs.CR·November 22, 2022

Open Access

TL;DR

This paper introduces a two-phase local differential privacy (LDP) framework that uses learned sketches to improve frequency estimation accuracy for large-domain data, especially for low-frequency items.

Contribution

It presents a novel two-phase, LDP-based learned sketch method that separates high-frequency and low-frequency items to reduce hash-collision errors, and it reports better accuracy than existing methods such as Apple-CMS, Apple-HCMS, and FLH.
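The collision effect behind this contribution can be illustrated with a small, non-private Count-Min sketch. This is a minimal sketch under stated assumptions, not the paper's method: the class `CountMinSketch`, the helpers `build_stream` and `overestimate_on_cold`, the SHA-256 based hashing, and the synthetic `hot*`/`cold*` stream are all hypothetical, and the set of heavy items is handed in by an oracle rather than learned. No LDP noise is applied here.

```python
import hashlib
from collections import Counter


class CountMinSketch:
    """Plain (non-private) Count-Min sketch: `depth` rows of `width` counters."""

    def __init__(self, depth, width):
        self.depth = depth
        self.width = width
        self.table = [[0] * width for _ in range(depth)]

    def _index(self, row, item):
        # Deterministic per-row hash (an illustrative choice, not from the paper).
        digest = hashlib.sha256(f"{row}:{item}".encode()).digest()
        return int.from_bytes(digest[:8], "big") % self.width

    def add(self, item, count=1):
        for row in range(self.depth):
            self.table[row][self._index(row, item)] += count

    def estimate(self, item):
        # Collisions only ever add to a counter, so the minimum over rows
        # is the least-inflated estimate.
        return min(self.table[row][self._index(row, item)]
                   for row in range(self.depth))


def build_stream():
    """Hypothetical skewed data: 5 frequent items and 200 rare items."""
    stream = []
    for i in range(5):
        stream += [f"hot{i}"] * 1000
    for i in range(200):
        stream += [f"cold{i}"] * 2
    return stream


def overestimate_on_cold(stream, heavy):
    """Total overestimate on the rare items when items in `heavy`
    are counted exactly instead of being inserted into the sketch."""
    truth = Counter(stream)
    sketch = CountMinSketch(depth=2, width=32)
    for item in stream:
        if item not in heavy:
            sketch.add(item)
    cold_items = [item for item in truth if item.startswith("cold")]
    return sum(sketch.estimate(item) - truth[item] for item in cold_items)


if __name__ == "__main__":
    stream = build_stream()
    hot = {f"hot{i}" for i in range(5)}
    print("all items in one sketch:", overestimate_on_cold(stream, set()))
    print("heavy items kept apart: ", overestimate_on_cold(stream, hot))
```

Because the hash functions are the same in both runs and counters only grow, keeping the frequent items out of the sketch can never increase the overestimate on the rare items; how much it helps depends on the sketch size and the skew of the data.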

Findings

01. The proposed method satisfies LDP guarantees.

02. It achieves higher accuracy than Apple-CMS, Apple-HCMS, and FLH.

03. Experimental results confirm improved performance.

Abstract

Sketches are widely used for frequency estimation over data with a large domain. However, sketch-based frequency estimation becomes more challenging when privacy must be considered. Local differential privacy (LDP) enables frequency estimation on sensitive data while preserving privacy: each user perturbs their data on the client side before reporting it, which protects privacy but also introduces errors into the frequency estimates. Hash collisions in the sketches make the estimates for low-frequency items even worse. In this paper, we propose a two-phase frequency estimation framework for data with a large domain based on an LDP learned sketch, which separates high-frequency and low-frequency items to avoid the errors caused by hash collisions. We theoretically prove that the proposed method satisfies LDP and that it is more accurate than the state-of-the-art frequency estimation…
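The client-side perturbation and the error it introduces can be made concrete with generalized randomized response, a standard LDP mechanism for small domains. This is a minimal sketch under stated assumptions: it is not the paper's sketch-based mechanism, nor Apple-CMS, Apple-HCMS, or FLH, and the function names `grr_probabilities`, `grr_perturb`, and `grr_estimate`, the four-item domain, and the value of `epsilon` are all hypothetical.

```python
import math
import random
from collections import Counter


def grr_probabilities(k, epsilon):
    """Return (p, q): p is the probability of reporting the true value,
    q the probability of reporting any one specific other value."""
    e = math.exp(epsilon)
    p = e / (e + k - 1)
    q = 1.0 / (e + k - 1)
    return p, q


def grr_perturb(value, domain, epsilon, rng):
    """Client side: keep the true value with probability p, otherwise
    report a uniformly random different value from the domain."""
    p, _ = grr_probabilities(len(domain), epsilon)
    if rng.random() < p:
        return value
    others = [v for v in domain if v != value]
    return rng.choice(others)


def grr_estimate(reports, domain, epsilon):
    """Server side: unbiased frequency estimates from perturbed reports.
    E[count_v] = f_v * p + (n - f_v) * q, solved for f_v."""
    p, q = grr_probabilities(len(domain), epsilon)
    n = len(reports)
    counts = Counter(reports)
    return {v: (counts[v] - n * q) / (p - q) for v in domain}


if __name__ == "__main__":
    rng = random.Random(0)
    domain = ["a", "b", "c", "d"]
    # Hypothetical true data: "a" is frequent, "d" is rare.
    true_values = ["a"] * 700 + ["b"] * 200 + ["c"] * 90 + ["d"] * 10
    reports = [grr_perturb(v, domain, 1.0, rng) for v in true_values]
    estimates = grr_estimate(reports, domain, 1.0)
    for v in domain:
        print(v, "true:", true_values.count(v), "estimated:", round(estimates[v], 1))
```

The ratio p/q equals e^ε, which is what gives each report its ε-LDP guarantee. The estimator is unbiased, but its noise does not shrink with an item's true count, so the relative error is largest for rare items. The number of possible wrong answers also grows with the domain size, which is one reason hashing into sketches is used for large domains.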

Taxonomy

Topics: Anomaly Detection Techniques and Applications · Privacy-Preserving Technologies in Data · Music and Audio Processing