Labeling without Seeing? Blind Annotation for Privacy-Preserving Entity Resolution
Yixiang Yao, Weizhao Jin, Srivatsan Ravi

TL;DR
This paper introduces a privacy-preserving blind annotation protocol for entity resolution using homomorphic encryption, enabling collaborative labeling without exposing sensitive data, and demonstrates its feasibility with high accuracy in experiments.
Contribution
It presents what the authors describe as the first privacy-preserving ground truth generation method for entity resolution. The method is built on homomorphic encryption and comes with a domain-specific language intended to make the protocol easier to use.
Findings
Achieves an F-measure above 90% when the blindly annotated labels are compared against the real ground truth.
Provides rigorous privacy guarantees for data owners.
Demonstrates practical feasibility through empirical experiments.
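For readers unfamiliar with the metric in the first finding, the following is a minimal sketch of how an F-measure over matched record pairs is commonly computed. It is not the authors' evaluation code: the function name `pairwise_f_measure` and the record IDs are hypothetical, and it assumes labels are represented as sets of (left record, right record) pairs.

```python
def pairwise_f_measure(predicted, truth):
    """Return (precision, recall, F1) over sets of matched record pairs."""
    predicted, truth = set(predicted), set(truth)
    true_positives = len(predicted & truth)

    # Guard against empty sets so the ratios are always defined.
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(truth) if truth else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0

    # F1 is the harmonic mean of precision and recall.
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical record-ID pairs, for illustration only.
truth = {("a1", "b1"), ("a2", "b2"), ("a3", "b3"), ("a4", "b4")}
predicted = {("a1", "b1"), ("a2", "b2"), ("a3", "b3"), ("a5", "b9")}

precision, recall, f1 = pairwise_f_measure(predicted, truth)
print(f"precision={precision:.2f} recall={recall:.2f} f1={f1:.2f}")
```

In this toy case three of the four predicted pairs are correct and three of the four true pairs are found, so precision, recall, and F1 all equal 0.75.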
Abstract
The entity resolution problem requires finding pairs across datasets that belong to different owners but refer to the same entity in the real world. To train and evaluate solutions (either rule-based or machine-learning-based) to the entity resolution problem, generating a ground truth dataset with entity pairs or clusters is needed. However, such a data annotation process involves humans as domain oracles to review the plaintext data for all candidate record pairs from different parties, which inevitably infringes the privacy of data owners, especially in privacy-sensitive cases like medical records. To the best of our knowledge, there is no prior work on privacy-preserving ground truth dataset generation, especially in the domain of entity resolution. We propose a novel blind annotation protocol based on homomorphic encryption that allows domain oracles to collaboratively label ground…
Taxonomy
Topics: Privacy-Preserving Technologies in Data · Data Quality and Management · Cryptography and Data Security
