PrivLava: Synthesizing Relational Data with Foreign Keys under Differential Privacy
Kuntai Cai, Xiaokui Xiao, Graham Cormode

TL;DR
PrivLava is a novel method for generating synthetic multi-relational data with foreign keys under differential privacy, enabling accurate query answering while protecting sensitive information.
Contribution
PrivLava is the first approach to synthesize relational data with foreign keys under differential privacy. It models the data with graphical models that include latent variables.
Findings
PrivLava outperforms existing methods in query accuracy.
PrivLava supports arbitrary foreign key structures in relational data.
PrivLava effectively handles databases that mix public and private relations.
Abstract
Answering database queries while preserving privacy is an important problem that has attracted considerable research attention in recent years. A canonical approach to this problem is to use synthetic data. That is, we replace the input database R with a synthetic database R* that preserves the characteristics of R, and use R* to answer queries. Existing solutions for relational data synthesis, however, either fail to provide strong privacy protection, or assume that R contains a single relation. In addition, it is challenging to extend the existing single-relation solutions to the case of multiple relations, because they are unable to model the complex correlations induced by the foreign keys. Therefore, multi-relational data synthesis with strong privacy guarantees is an open problem. In this paper, we address the above open problem by proposing PrivLava, the first solution for…
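To make the foreign-key difficulty concrete, here is a minimal sketch in Python. It is not PrivLava and provides no differential privacy. It builds a hypothetical two-relation database (households, and individuals that reference a household) and then synthesizes each relation independently from its own marginal, assigning foreign keys uniformly at random. All names, attributes, and probabilities are assumptions made up for this illustration.

```python
import random


def make_database(n_households=2000, seed=0):
    """Build a toy database R: individuals reference households by id."""
    rng = random.Random(seed)
    households = []   # rows: (household_id, income_band)
    individuals = []  # rows: (household_id as foreign key, is_student)
    for hid in range(n_households):
        income = rng.choice(["low", "high"])
        households.append((hid, income))
        # The individual attribute depends on the referenced household row.
        p_student = 0.6 if income == "high" else 0.1
        for _ in range(rng.randint(1, 4)):
            individuals.append((hid, rng.random() < p_student))
    return households, individuals


def students_in_high_income(households, individuals):
    """Join query across the foreign key: students in high-income households."""
    income_of = dict(households)
    return sum(
        1 for hid, is_student in individuals
        if is_student and income_of[hid] == "high"
    )


def student_rate(individuals):
    """Single-relation statistic: share of individuals who are students."""
    return sum(1 for _, is_student in individuals if is_student) / len(individuals)


def naive_synthesis(households, individuals, seed=1):
    """Synthesize each relation on its own (non-private, illustration only).

    Attributes are resampled from each relation's own marginal and foreign
    keys are drawn uniformly at random, so cross-relation correlation is lost.
    """
    rng = random.Random(seed)
    incomes = [income for _, income in households]
    flags = [is_student for _, is_student in individuals]
    syn_households = [(hid, rng.choice(incomes)) for hid in range(len(households))]
    syn_individuals = [
        (rng.randrange(len(households)), rng.choice(flags)) for _ in individuals
    ]
    return syn_households, syn_individuals


households, individuals = make_database()
syn_households, syn_individuals = naive_synthesis(households, individuals)

true_count = students_in_high_income(households, individuals)
naive_count = students_in_high_income(syn_households, syn_individuals)
real_rate = student_rate(individuals)
syn_rate = student_rate(syn_individuals)

print("join query on R: ", true_count)
print("join query on R*:", naive_count)
print("student rate on R vs R*:", round(real_rate, 3), round(syn_rate, 3))
```

In this toy setting the single-relation statistic stays close to the original, while the join query is badly underestimated because the dependence between an individual and the referenced household is gone. That gap is the kind of inter-relational correlation the abstract says single-relation methods cannot model.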
Taxonomy
Topics: Privacy-Preserving Technologies in Data · Data Quality and Management
