Generating Realistic Synthetic Relational Data through Graph Variational Autoencoders
Ciro Antonio Mami, Andrea Coser, Eric Medvet, Alexander T.P. Boudewijn, Marco Volpe, Michael Whitworth, Borut Svara, Gabriele Sgroi, Daniele Panfilo, Sebastiano Saccani

TL;DR
This paper combines graph neural networks with the variational autoencoder framework to generate realistic synthetic relational databases that preserve the structure of the original data, targeting domains such as healthcare and finance.
Contribution
It presents the first application of graph variational autoencoders to realistic synthetic relational data generation, an area that remains little studied compared with the generation of single tabular datasets.
Findings
Synthetic data closely preserves original database structures
Method performs well on large datasets with complex data types
Generated datasets are suitable for privacy-preserving data sharing
Abstract
Synthetic data generation has recently gained widespread attention as a more reliable alternative to traditional data anonymization. The involved methods are originally developed for image synthesis. Hence, their application to the typically tabular and relational datasets from healthcare, finance and other industries is non-trivial. While substantial research has been devoted to the generation of realistic tabular datasets, the study of synthetic relational databases is still in its infancy. In this paper, we combine the variational autoencoder framework with graph neural networks to generate realistic synthetic relational databases. We then apply the obtained method to two publicly available databases in computational experiments. The results indicate that real databases' structures are accurately preserved in the resulting synthetic datasets, even for large datasets with advanced…
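The abstract does not describe the architecture, so the following is only a minimal sketch of the general idea of pairing a graph view of a relational database with a variational encoder. It is not the authors' implementation. Everything in it is an assumption made for illustration: the toy `customers` and `orders` tables, the choice to treat rows as nodes and foreign-key references as edges, the single mean-aggregation step standing in for a graph neural network layer, and the randomly initialised weights.

```python
import numpy as np

# Hypothetical toy database (not from the paper): 2 customers, 3 orders.
customers = np.array([[34.0], [51.0]])        # parent table, one column: age
orders = np.array([[20.0], [35.5], [12.0]])   # child table, one column: amount
order_customer_fk = np.array([0, 0, 1])       # orders[i] references customers[fk[i]]


def build_graph(parents, children, fk):
    """Assumed encoding: rows become nodes, foreign keys become undirected edges."""
    n_p, n_c = len(parents), len(children)
    n = n_p + n_c
    adj = np.zeros((n, n))
    for child_idx, parent_idx in enumerate(fk):
        adj[n_p + child_idx, parent_idx] = 1.0
        adj[parent_idx, n_p + child_idx] = 1.0
    # Shared feature layout: [age, amount, is_customer, is_order]
    x = np.zeros((n, 4))
    x[:n_p, 0] = parents[:, 0]
    x[n_p:, 1] = children[:, 0]
    x[:n_p, 2] = 1.0
    x[n_p:, 3] = 1.0
    return x, adj


def encode(x, adj, w_mu, w_logvar):
    """One mean-aggregation message-passing step, then linear heads for mu and log-variance."""
    a_hat = adj + np.eye(len(adj))                      # add self-loops
    a_norm = a_hat / a_hat.sum(axis=1, keepdims=True)   # average over self and neighbours
    h = a_norm @ x
    return h @ w_mu, h @ w_logvar


def reparameterize(mu, logvar, rng):
    """Sample z = mu + sigma * eps with eps ~ N(0, I)."""
    eps = rng.standard_normal(mu.shape)
    return mu + np.exp(0.5 * logvar) * eps


def kl_divergence(mu, logvar):
    """KL(N(mu, sigma^2) || N(0, I)), summed over nodes and latent dimensions."""
    return -0.5 * np.sum(1.0 + logvar - mu**2 - np.exp(logvar))


rng = np.random.default_rng(0)
x, adj = build_graph(customers, orders, order_customer_fk)
w_mu = rng.normal(scale=0.01, size=(4, 2))      # untrained weights, illustration only
w_logvar = rng.normal(scale=0.01, size=(4, 2))
mu, logvar = encode(x, adj, w_mu, w_logvar)
z = reparameterize(mu, logvar, rng)             # one latent vector per row (node)
kl = kl_divergence(mu, logvar)
print(z.shape, kl)
```

In a trained model of this kind, a decoder would map each latent vector back to row values and the weights would be fitted by minimising a reconstruction loss plus the KL term; both are omitted here because the abstract gives no details about them.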
Taxonomy
Topics: Epigenetics and DNA Methylation · Advanced Graph Neural Networks · Privacy-Preserving Technologies in Data
