Scalable Multi-Database Privacy-Preserving Record Linkage using Counting Bloom Filters
Dinusha Vatsalan, Peter Christen, and Erhard Rahm

TL;DR
This paper introduces a scalable, privacy-preserving method for linking multiple databases using Counting Bloom Filters, addressing challenges in computation, communication, and privacy in multi-party scenarios.
Contribution
It proposes a novel CBF-based encoding technique for multi-party PPRL and explores optimizations to enhance scalability, privacy, and efficiency.
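As a rough illustration of the general idea behind CBF-based encoding, the sketch below sums per-party Bloom filters position by position into a Counting Bloom Filter and derives a multi-party Dice similarity from the counts. It is a minimal sketch under stated assumptions, not the authors' protocol: the function names, the bigram tokenisation, the filter length, the number of hash functions, and the SHA-256 hashing scheme are all hypothetical choices, and no privacy mechanism for computing the sum without revealing individual filters is modeled.

```python
import hashlib


def qgrams(value, q=2):
    """Split a string into its set of overlapping q-grams (assumed tokenisation)."""
    value = value.lower()
    return {value[i:i + q] for i in range(len(value) - q + 1)}


def bloom_filter(value, length=64, num_hashes=3):
    """Encode a string as a bit list; length and num_hashes are illustrative only."""
    bits = [0] * length
    for gram in qgrams(value):
        for seed in range(num_hashes):
            # Seeded SHA-256 stands in for a family of independent hash functions.
            digest = hashlib.sha256(f"{seed}:{gram}".encode("utf-8")).hexdigest()
            bits[int(digest, 16) % length] = 1
    return bits


def counting_bloom_filter(filters):
    """Sum the parties' Bloom filters position by position."""
    return [sum(column) for column in zip(*filters)]


def dice_from_cbf(cbf, num_parties):
    """Multi-party Dice coefficient computed from the counts alone.

    A position set by every party has a count equal to num_parties, and the
    sum of all counts equals the total number of 1-bits across all filters.
    """
    common = sum(1 for count in cbf if count == num_parties)
    total = sum(cbf)
    return num_parties * common / total if total else 0.0


# Three hypothetical parties each hold a variant of the same surname.
filters = [bloom_filter(name) for name in ("smith", "smyth", "smith")]
cbf = counting_bloom_filter(filters)
similarity = dice_from_cbf(cbf, num_parties=3)
```

In this sketch the similarity is computed from the summed counts only, so whoever holds the CBF sees how many parties set each position rather than which party set it.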
Findings
Demonstrates the scalability of the approach with real datasets
Shows improved privacy protection over existing methods
Achieves efficient linkage with reduced communication costs
Abstract
Privacy-preserving record linkage (PPRL) aims at integrating sensitive information from multiple disparate databases of different organizations. PPRL approaches are increasingly required in real-world application areas such as healthcare, national security, and business. Previous approaches have mostly focused on linking only two databases as well as the use of a dedicated linkage unit. Scaling PPRL to more databases (multi-party PPRL) is an open challenge since privacy threats as well as the computation and communication costs for record linkage increase significantly with the number of databases. We thus propose the use of a new encoding method of sensitive data based on Counting Bloom Filters (CBF) to improve privacy for multi-party PPRL. We also investigate optimizations to reduce communication and computation costs for CBF-based multi-party PPRL with and without the use of a…
Taxonomy
Topics: Data Quality and Management · Privacy-Preserving Technologies in Data · Cloud Data Security Solutions
