Information Leakage in Data Linkage
Peter Christen, Rainer Schnell, Anushka Vidanage

TL;DR
This paper examines potential information leaks in traditional and privacy-preserving data linkage methods, highlighting vulnerabilities and offering recommendations to enhance security in sensitive data integration.
Contribution
It provides a comprehensive analysis of leakage risks in privacy-preserving record linkage (PPRL) protocols and offers practical guidelines to prevent vulnerabilities in real-world data linkage applications.
Findings
PPRL protocols can still leak sensitive information, whether unintentionally or intentionally.
Organizational challenges in implementing PPRL are significant.
The paper's recommendations aim to improve security and reduce leakage risks.
Abstract
The process of linking databases that contain sensitive information about individuals across organisations is an increasingly common requirement in the health and social science research domains, as well as with governments and businesses. To protect personal data, protocols have been developed to limit the leakage of sensitive information. Furthermore, privacy-preserving record linkage (PPRL) techniques have been proposed to conduct linkage on encoded data. While PPRL techniques are now being employed in real-world applications, the focus of PPRL research has been on the technical aspects of linking sensitive data (such as encoding methods and cryptanalysis attacks), but not on organisational challenges when employing such techniques in practice. We analyse what sensitive information can possibly leak, either unintentionally or intentionally, in traditional data linkage as well as PPRL…
