Multi-Layer Privacy-Preserving Record Linkage with Clerical Review based on gradual information disclosure
Florens Rohde, Victor Christen, Martin Franke, Erhard Rahm

TL;DR
This paper introduces a multi-layer privacy-preserving record linkage protocol that uses clerical review and active learning to improve linkage quality while minimizing information disclosure and respecting data sovereignty.
Contribution
It presents a novel multi-layer active learning protocol that integrates clerical review into privacy-preserving record linkage, improving accuracy while limiting privacy risks.
Findings
Significant improvement in linkage quality demonstrated on real-world datasets.
Reduced privacy risks through multi-layer information disclosure.
Efficient use of limited labeling effort with active learning.
Abstract
Privacy-Preserving Record Linkage (PPRL) is an essential component of data integration tasks involving sensitive information. The linkage quality determines the usability of combined datasets and (machine learning) applications based on them. We present a novel privacy-preserving protocol that integrates clerical review into PPRL using a multi-layer active learning process. Uncertain match candidates are reviewed on several layers by human and non-human oracles to reduce the amount of disclosed information per record and in total. Predictions are propagated back to update previous layers, resulting in improved linkage performance for non-reviewed candidates as well. The data owners remain in control of the amount of information they share for each record. Therefore, our approach follows need-to-know and data sovereignty principles. The experimental evaluation on real-world datasets shows…
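The layered review described in the abstract can be illustrated with a small Python sketch. It is not the authors' protocol: the names `Candidate`, `multi_layer_review`, `refit_threshold` and `classify`, the similarity band, the review budget, the "closest to the middle of the band" selection heuristic and the midpoint threshold update are all hypothetical choices made for illustration. It only shows the general idea: uncertain pairs escalate through oracles layer by layer, escalation stops at the first decision so that deeper (more revealing) layers see fewer pairs, and the collected labels feed back into the base-layer decision rule.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Candidate:
    pair_id: int
    similarity: float              # score computed on the encoded records
    label: Optional[bool] = None   # True = match, False = non-match, None = unreviewed
    disclosed_layer: int = 0       # deepest layer that saw this pair (0 = none)


# An oracle returns a decision, or None if it cannot decide at its layer.
Oracle = Callable[[Candidate], Optional[bool]]


def refit_threshold(candidates: List[Candidate], default: float) -> float:
    """Hypothetical feedback step: midpoint between labeled matches and non-matches."""
    matches = [c.similarity for c in candidates if c.label is True]
    non_matches = [c.similarity for c in candidates if c.label is False]
    if not matches or not non_matches:
        return default
    return (min(matches) + max(non_matches)) / 2


def multi_layer_review(candidates, oracles, lower, upper, budget):
    """Escalate uncertain pairs through the oracles and return an updated threshold."""
    mid = (lower + upper) / 2
    uncertain = [c for c in candidates if lower <= c.similarity < upper]
    # Assumed selection heuristic: review the most ambiguous pairs first.
    uncertain.sort(key=lambda c: abs(c.similarity - mid))
    for cand in uncertain[:budget]:
        for layer, oracle in enumerate(oracles, start=1):
            cand.disclosed_layer = layer   # each layer is assumed to disclose more
            decision = oracle(cand)
            if decision is not None:       # stop escalating once a layer decides
                cand.label = decision
                break
    return refit_threshold(candidates, default=mid)


def classify(candidates, threshold):
    """Reviewed pairs keep their label; the rest use the updated threshold."""
    return {
        c.pair_id: c.label if c.label is not None else c.similarity >= threshold
        for c in candidates
    }


# Toy data: ground truth is used only to simulate the oracles.
truth = {1: True, 2: True, 3: False, 4: True, 5: False, 6: False}
cands = [
    Candidate(1, 0.95), Candidate(2, 0.78), Candidate(3, 0.74),
    Candidate(4, 0.70), Candidate(5, 0.62), Candidate(6, 0.30),
]


def automated_oracle(c):   # layer 1: non-human oracle, decides only some pairs
    return truth[c.pair_id] if c.pair_id == 3 else None


def human_oracle(c):       # layer 2: clerical reviewer, always decides
    return truth[c.pair_id]


threshold = multi_layer_review(
    cands, [automated_oracle, human_oracle], lower=0.6, upper=0.8, budget=2
)
labels = classify(cands, threshold)
```

In this toy run only two pairs are reviewed, and only one of them reaches the human layer; the remaining pairs are classified with the refitted threshold without any additional disclosure.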
