PrivLava: Synthesizing Relational Data with Foreign Keys under Differential Privacy

Kuntai Cai; Xiaokui Xiao; Graham Cormode

arXiv:2304.04545·cs.DB·April 11, 2023·5 cites

TL;DR

PrivLava is a novel method for generating synthetic multi-relational data with foreign keys under differential privacy, enabling accurate query answering while protecting sensitive information.

Contribution

The paper introduces the first approach to synthesizing relational data with foreign keys under differential privacy; it models the data with graphical models that include latent variables.

Findings

01

PrivLava outperforms existing methods in query accuracy.

02

PrivLava supports arbitrary foreign-key structures in relational data.

03

PrivLava effectively handles databases that mix public and private relations.

Abstract

Answering database queries while preserving privacy is an important problem that has attracted considerable research attention in recent years. A canonical approach to this problem is to use synthetic data. That is, we replace the input database R with a synthetic database R* that preserves the characteristics of R, and use R* to answer queries. Existing solutions for relational data synthesis, however, either fail to provide strong privacy protection, or assume that R contains a single relation. In addition, it is challenging to extend the existing single-relation solutions to the case of multiple relations, because they are unable to model the complex correlations induced by the foreign keys. Therefore, multi-relational data synthesis with strong privacy guarantees is an open problem. In this paper, we address the above open problem by proposing PrivLava, the first solution for…
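
The evaluation setting the abstract describes (answer a query on the synthetic database R* instead of the input database R, then compare) can be illustrated with a minimal sketch. It does not implement PrivLava or any differential privacy mechanism. The two-relation schema, the toy rows, the hand-written synthetic database, and the helper names `count_join` and `relative_error` are all assumptions made for illustration.

```python
from typing import Dict, List, Tuple

# Hypothetical toy schema (not from the paper):
#   households:  hid -> region      (primary relation)
#   individuals: (hid, age)         (hid is a foreign key into households)
Households = Dict[int, str]
Individuals = List[Tuple[int, int]]


def count_join(households: Households, individuals: Individuals,
               region: str, min_age: int) -> int:
    """Count individuals aged >= min_age whose household lies in `region`."""
    return sum(
        1
        for hid, age in individuals
        if age >= min_age and households.get(hid) == region
    )


def relative_error(true_answer: float, estimate: float, floor: float = 1.0) -> float:
    """Relative error with a floor on the denominator to avoid division by zero."""
    return abs(true_answer - estimate) / max(abs(true_answer), floor)


# Input database R (toy values).
R_households = {1: "north", 2: "north", 3: "south"}
R_individuals = [(1, 70), (1, 34), (2, 68), (2, 66), (3, 80)]

# Synthetic database R* (hand-written here; a real synthesizer would generate it).
S_households = {1: "north", 2: "south", 3: "north"}
S_individuals = [(1, 72), (1, 30), (2, 67), (3, 69), (3, 25)]

# The same foreign-key join query, answered on R and on R*.
true_answer = count_join(R_households, R_individuals, "north", 65)
synthetic_answer = count_join(S_households, S_individuals, "north", 65)
error = relative_error(true_answer, synthetic_answer)
```

Because the query joins across the foreign key, its answer depends on how rows in the two relations are linked, not only on each relation's own attribute distribution. That cross-relation correlation is what the abstract says single-relation synthesizers are unable to model.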

Taxonomy

Topics: Privacy-Preserving Technologies in Data · Data Quality and Management