Synthetic Tabular Data Generation: A Comparative Survey for Modern Techniques

Raju Challagundla; Mohsen Dorodchi; Pu Wang; Minwoo Lee

arXiv:2507.11590 · cs.LG · July 17, 2025

Open Access

TL;DR

This survey reviews recent methods for generating synthetic tabular data, focusing on preserving data utility, privacy, and complex feature relationships, and introduces a taxonomy and benchmark framework to guide future research and practical applications.

Contribution

The survey introduces a novel taxonomy based on generation objectives and proposes a benchmark framework, connecting theoretical methods to real-world privacy and utility needs.

Findings

1. Highlights the importance of preserving feature relationships
2. Emphasizes privacy guarantees in synthetic data generation
3. Provides a benchmark for evaluating synthetic tabular data methods

Abstract

As privacy regulations become more stringent and access to real-world data becomes increasingly constrained, synthetic data generation has emerged as a vital solution, especially for tabular datasets, which are central to domains like finance, healthcare and the social sciences. This survey presents a comprehensive and focused review of recent advances in synthetic tabular data generation, emphasizing methods that preserve complex feature relationships, maintain statistical fidelity, and satisfy privacy requirements. A key contribution of this work is the introduction of a novel taxonomy based on practical generation objectives, including intended downstream applications, privacy guarantees, and data utility, directly informing methodological design and evaluation strategies. Therefore, this review prioritizes the actionable goals that drive synthetic data creation, including…

Taxonomy

Topics: Data Management and Algorithms