Evaluating the Feasibility of a Provably Secure Privacy-Preserving Entity Resolution Adaptation of PPJoin using Homomorphic Encryption
Tanmay Ghai, Yixiang Yao, Srivatsan Ravi, Pedro Szekely

TL;DR
This paper presents HE-PPJoin, a homomorphic encryption-based adaptation of the efficient PPJoin algorithm for privacy-preserving entity resolution, demonstrating its accuracy, efficiency, and privacy advantages over existing methods.
Contribution
We introduce HE-PPJoin, a novel homomorphic encryption adaptation of PPJoin, with detailed data structure modifications and algorithmic enhancements for privacy and correctness.
Findings
HE-PPJoin achieves comparable accuracy to the original PPJoin.
HE-PPJoin demonstrates improved privacy over fingerprinting methods.
Overhead analysis shows acceptable performance for practical use.
Abstract
Entity resolution is the task of disambiguating records that refer to the same real-world entity. In this work, we explore adapting PPJoin, one of the most efficient and accurate Jaccard-based entity resolution algorithms, to the private domain via homomorphic encryption. To this end, we present a precise adaptation of PPJoin (HE-PPJoin) and detail the subtle data structure modifications and algorithmic additions needed for correctness and privacy. We implement HE-PPJoin by extending the PALISADE homomorphic encryption library and evaluate its accuracy and incurred overhead. Furthermore, we directly compare HE-PPJoin against P4Join, an existing privacy-preserving variant of PPJoin which uses fingerprinting for raw content obfuscation, by demonstrating a rigorous analysis of the efficiency, accuracy, and privacy properties achieved by our adaptation as well as a…
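For readers unfamiliar with Jaccard-based joins, the following is a minimal plaintext sketch of the prefix-filtering idea that PPJoin builds on. It is not the authors' HE-PPJoin and performs no encryption. The function names (`jaccard`, `prefix_filter_join`) and the toy records are hypothetical and chosen only for illustration. PPJoin additionally applies positional filtering, which this sketch omits.

```python
from collections import Counter
from math import ceil

EPS = 1e-9  # guards against floating-point error in threshold * size


def jaccard(a, b):
    """Jaccard similarity |a ∩ b| / |a ∪ b| of two token collections."""
    a, b = set(a), set(b)
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def prefix_filter_join(records, threshold):
    """Self-join: return (i, j, sim) for pairs with Jaccard >= threshold.

    Tokens are sorted by a global order (rarest first). Two records can
    only reach the threshold if their prefixes share at least one token,
    so only those pairs are verified with a full Jaccard computation.
    """
    freq = Counter(tok for rec in records for tok in set(rec))
    ordered = [sorted(set(rec), key=lambda t: (freq[t], t)) for rec in records]

    index = {}    # token -> ids of earlier records with that token in their prefix
    matches = []
    for rid, toks in enumerate(ordered):
        if not toks:
            continue
        # Prefix length: |x| - ceil(t * |x|) + 1
        prefix_len = len(toks) - ceil(threshold * len(toks) - EPS) + 1
        candidates = set()
        for tok in toks[:prefix_len]:
            candidates.update(index.get(tok, []))
            index.setdefault(tok, []).append(rid)
        for other in sorted(candidates):
            sim = jaccard(toks, ordered[other])
            if sim >= threshold - EPS:
                matches.append((other, rid, sim))
    return sorted(matches)


if __name__ == "__main__":
    toy = [
        ["john", "smith", "los", "angeles"],
        ["jon", "smith", "los", "angeles"],
        ["mary", "jones", "new", "york"],
    ]
    print(prefix_filter_join(toy, 0.5))  # [(0, 1, 0.6)]
```

In this sketch every comparison runs on raw tokens. A private variant has to perform the equivalent checks without revealing record contents, which is where the data structure modifications mentioned in the abstract come in.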
Taxonomy
Topics: Privacy-Preserving Technologies in Data · Data Quality and Management · Cryptography and Data Security
