Disjoint Generative Models
Anton Danholt Lautrup, Muhammad Rajabinasab, Tobias Hyrup, Arthur Zimek, Peter Schneider-Kamp

TL;DR
This paper introduces a novel framework for generating synthetic datasets using disjoint generative models, enhancing privacy with minimal utility loss, demonstrated through various tabular data case studies.
Contribution
The paper presents a new disjoint generative modeling framework that improves privacy and allows for flexible, mixed-model synthesis without requiring shared variables.
Findings
Significantly increased privacy with low utility loss
Improved effectiveness and feasibility for certain model types, and support for mixed-model synthesis
Validated through multiple case studies on tabular data
Abstract
We propose a new framework for generating cross-sectional synthetic datasets via disjoint generative models. In this paradigm, a dataset is partitioned into disjoint subsets that are supplied to separate instances of generative models. The results are then combined post hoc by a joining operation that works in the absence of common variables/identifiers. The success of the framework is demonstrated through several case studies and examples on tabular data that help illuminate some of the design choices that one may make. The principal benefit of disjoint generative models is significantly increased privacy at only a low utility cost. Additional findings include increased effectiveness and feasibility for certain model types and the possibility for mixed-model synthesis.
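The partition, synthesize, and join pipeline described in the abstract can be illustrated with a minimal sketch. Everything named here is hypothetical and not taken from the paper: `partition_columns`, `BootstrapModel`, and `disjoint_synthesize` are illustrative names, the "generative model" is a simple row-resampling stand-in, and the join is a naive positional concatenation chosen only because it needs no shared variables. The paper's actual joining operation may differ.

```python
import random


def partition_columns(columns, n_parts):
    """Split column names into n_parts disjoint groups (round-robin)."""
    return [columns[i::n_parts] for i in range(n_parts)]


class BootstrapModel:
    """Stand-in for a generative model (assumption): resamples stored rows."""

    def fit(self, rows):
        self.rows = list(rows)
        return self

    def sample(self, n, rng):
        return [dict(rng.choice(self.rows)) for _ in range(n)]


def disjoint_synthesize(data, n_parts, n_samples, seed=0):
    """Fit one model per column partition, then join the samples by position."""
    rng = random.Random(seed)
    columns = list(data[0].keys())
    parts = partition_columns(columns, n_parts)

    synthetic_parts = []
    for cols in parts:
        # Each model instance only ever sees its own subset of columns.
        subset = [{c: row[c] for c in cols} for row in data]
        model = BootstrapModel().fit(subset)
        synthetic_parts.append(model.sample(n_samples, rng))

    # Naive join (assumption): pair the i-th sample of every partition.
    # No common variable or identifier is used.
    joined = []
    for i in range(n_samples):
        record = {}
        for part in synthetic_parts:
            record.update(part[i])
        joined.append({c: record[c] for c in columns})
    return joined, parts


# Toy data, invented purely for illustration.
data = [
    {"age": 34, "sex": "F", "income": 52000, "smoker": False},
    {"age": 51, "sex": "M", "income": 61000, "smoker": True},
    {"age": 27, "sex": "F", "income": 38000, "smoker": False},
    {"age": 45, "sex": "M", "income": 47000, "smoker": False},
]

synthetic, parts = disjoint_synthesize(data, n_parts=2, n_samples=5)
for row in synthetic:
    print(row)
```

Because each stand-in model sees only its own columns, a synthetic record may combine values that never co-occurred in one real record. That is the intuition behind the privacy gain, and also why some utility is lost when cross-partition correlations are not restored by the join.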
