Privacy-preserving Deep Learning based Record Linkage

Thilina Ranbaduge; Dinusha Vatsalan; Ming Ding

arXiv:2211.02161·cs.CR·November 7, 2022

TL;DR

This paper introduces a novel multi-party privacy-preserving deep learning protocol for record linkage that ensures data privacy while maintaining high linkage accuracy across multiple organizations.

Contribution

It presents the first deep learning-based multi-party PPRL protocol utilizing differential privacy and secure aggregation for effective and private record linkage.

Findings

01. Achieves high linkage quality on large real-world databases.

02. Provides provable privacy protection against re-identification attacks.

03. Demonstrates scalability and effectiveness of the approach.

Abstract

Deep learning-based linkage of records across different databases is becoming increasingly useful in data integration and mining applications to discover new insights from multiple sources of data. However, due to privacy and confidentiality concerns, organisations are often unwilling or not allowed to share their sensitive data with any external parties, which makes it challenging to train deep learning models for record linkage across different organisations' databases. To overcome this limitation, we propose the first deep learning-based multi-party privacy-preserving record linkage (PPRL) protocol that can be used to link sensitive databases held by multiple organisations. In our approach, each database owner first trains a local deep learning model, which is then uploaded to a secure environment and securely aggregated to create a global model. The global model is…
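
The local-train-then-aggregate step described in the abstract can be illustrated with a minimal sketch. This excerpt does not specify the paper's actual secure aggregation scheme, so the code below assumes a generic pairwise additive masking approach: every pair of parties shares a random vector that one adds and the other subtracts, so the masks cancel when the aggregator sums the uploads. All names (`pairwise_masks`, `mask_update`, `aggregate`) and the toy weight vectors are hypothetical, and the differential privacy component is not shown.

```python
import random


def pairwise_masks(num_parties, dim, seed=0):
    """Return one mask vector per party; the masks sum to zero overall.

    Assumption: a single seeded generator stands in for the pairwise shared
    secrets that parties would normally agree on via key exchange.
    """
    rng = random.Random(seed)
    masks = [[0.0] * dim for _ in range(num_parties)]
    for i in range(num_parties):
        for j in range(i + 1, num_parties):
            shared = [rng.uniform(-1.0, 1.0) for _ in range(dim)]
            for k in range(dim):
                masks[i][k] += shared[k]  # party i adds the shared vector
                masks[j][k] -= shared[k]  # party j subtracts the same vector
    return masks


def mask_update(weights, mask):
    """Hide a party's local model weights behind its mask before upload."""
    return [w + m for w, m in zip(weights, mask)]


def aggregate(masked_updates):
    """Average the masked uploads; the pairwise masks cancel in the sum."""
    n = len(masked_updates)
    dim = len(masked_updates[0])
    return [sum(u[k] for u in masked_updates) / n for k in range(dim)]


# Toy local models (flattened weight vectors), one per database owner.
local_models = [
    [0.2, -1.0, 0.5],
    [0.4, -0.8, 0.1],
    [0.0, -1.2, 0.9],
]

masks = pairwise_masks(len(local_models), dim=3)
masked = [mask_update(w, m) for w, m in zip(local_models, masks)]
global_model = aggregate(masked)
```

Here `global_model` matches the plain average of the three local weight vectors up to floating-point rounding, while each individual upload in `masked` differs from the owner's true weights. A production scheme would typically work over a finite field with key-derived masks and handle dropouts; floats are used here only to keep the sketch short.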

Taxonomy

Topics: Data Quality and Management · Privacy-Preserving Technologies in Data