Sketches-based join size estimation under local differential privacy

Meifan Zhang; Xin Liu; Lihua Yin

arXiv:2405.11419 · cs.DB · May 21, 2024

Open Access

TL;DR

This paper introduces LDPJoinSketch and LDPJoinSketch+, two sketch-based algorithms that improve the accuracy of join size estimation under local differential privacy by reducing both noise error and hash-collision error.

Contribution

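The paper presents two novel sketch-based algorithms: LDPJoinSketch, which targets the noise error caused by protecting sensitive join values with large domains, and LDPJoinSketch+, which additionally addresses the hash-collision errors inherent in sketches under local differential privacy.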

Findings

01. Outperforms existing methods in accuracy

02. Effectively reduces noise and hash-collision errors

03. Estimation error bounds are satisfied under LDP

Abstract

Join size estimation on sensitive data poses a risk of privacy leakage. Local differential privacy (LDP) is a solution to preserve privacy while collecting sensitive data, but it introduces significant noise when dealing with sensitive join attributes that have large domains. Employing probabilistic structures such as sketches is a way to handle large domains, but it leads to hash-collision errors. To achieve accurate estimations, it is necessary to reduce both the noise error and hash-collision error. To tackle the noise error caused by protecting sensitive join values with large domains, we introduce a novel algorithm called LDPJoinSketch for sketch-based join size estimation under LDP. Additionally, to address the inherent hash-collision errors in sketches under LDP, we propose an enhanced method called LDPJoinSketch+. It utilizes a frequency-aware perturbation mechanism that…
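
To make the sketch-based idea concrete, here is a minimal, non-private illustration of join size estimation with a Fast-AGMS-style sketch. It is not LDPJoinSketch or LDPJoinSketch+: it applies no perturbation and gives no privacy guarantee. The function names, the default sizes (5 rows, 64 columns), and the use of BLAKE2b as a stand-in for a 4-wise independent hash family are all assumptions made for this example. Each value is hashed into one bucket per row with a random sign, and the join size is estimated as the median of the row-wise inner products of the two sketches. Distinct values that land in the same bucket are the source of the hash-collision error the abstract refers to.

```python
import hashlib
from collections import Counter
from statistics import median


def _digest(row: int, value: int) -> int:
    """Deterministic 64-bit hash of (row, value).

    Assumption: BLAKE2b stands in for the 4-wise independent hash
    family that the usual variance analysis requires.
    """
    data = f"{row}:{value}".encode()
    return int.from_bytes(hashlib.blake2b(data, digest_size=8).digest(), "big")


def build_sketch(values, rows=5, cols=64):
    """Build a rows x cols signed-counter sketch of a join attribute."""
    sketch = [[0] * cols for _ in range(rows)]
    for v in values:
        for j in range(rows):
            h = _digest(j, v)
            bucket = (h >> 1) % cols      # which counter to update
            sign = 1 if h & 1 else -1     # random +/-1 sign for this value
            sketch[j][bucket] += sign
    return sketch


def estimate_join_size(sketch_a, sketch_b):
    """Median over rows of the inner product of matching sketch rows."""
    return median(
        sum(a * b for a, b in zip(row_a, row_b))
        for row_a, row_b in zip(sketch_a, sketch_b)
    )


def exact_join_size(values_a, values_b):
    """Ground truth: sum over join values of freq_a(v) * freq_b(v)."""
    freq_a, freq_b = Counter(values_a), Counter(values_b)
    return sum(freq_a[v] * freq_b[v] for v in freq_a)


if __name__ == "__main__":
    table_a = [i % 50 for i in range(2000)]
    table_b = [i % 80 for i in range(3000)]
    estimate = estimate_join_size(build_sketch(table_a), build_sketch(table_b))
    print("exact:", exact_join_size(table_a, table_b), "estimate:", estimate)
```

When both tables contain a single distinct join value, no collision is possible and the estimate equals the exact join size. With many distinct values, the estimate deviates from the truth by an amount that shrinks as the number of columns grows. An LDP variant would additionally perturb each user's contribution before it reaches the collector, which adds noise error on top of this collision error.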

Taxonomy

Topics: Privacy-Preserving Technologies in Data · Internet Traffic Analysis and Secure E-voting · Privacy, Security, and Data Protection