Navigating Tabular Data Synthesis Research: Understanding User Needs and Tool Capabilities
Maria F. Davila R., Sven Groen, Fabian Panse, Wolfram Wingerath

TL;DR
This paper surveys current methods for synthesizing tabular data, analyzing user needs, tool capabilities, and challenges, and provides a decision guide to help select appropriate tools while highlighting research gaps.
Contribution
The paper offers a comprehensive survey of tabular data synthesis (TDS) tools, defines user requirements, evaluates tool performance against them, and introduces a decision guide to aid tool selection and identify research gaps.
Findings
Evaluation of 36 TDS tools against user requirements
Identification of key challenges in tabular data synthesis
Development of a decision guide for tool selection
Abstract
In an era of rapidly advancing data-driven applications, there is a growing demand for data in both research and practice. Synthetic data have emerged as an alternative when no real data is available (e.g., due to privacy regulations). Synthesizing tabular data presents unique and complex challenges, especially handling (i) missing values, (ii) dataset imbalance, (iii) diverse column types, and (iv) complex data distributions, as well as preserving (i) column correlations, (ii) temporal dependencies, and (iii) integrity constraints (e.g., functional dependencies) present in the original dataset. While substantial progress has been made recently in the context of generative models, there is no one-size-fits-all solution for tabular data today, and choosing the right tool for a given task is therefore not trivial. In this paper, we survey the state of the art in Tabular Data…
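As an illustration of one requirement named in the abstract, preserving integrity constraints such as functional dependencies, the following minimal sketch checks whether a dependency that holds in a real table still holds in a synthetic one. It is not taken from the paper: the helper `fd_violations`, the column names `zip` and `city`, and the toy rows are all hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def fd_violations(rows, lhs, rhs):
    """Return the lhs value combinations that map to more than one rhs value.

    rows: list of dicts (one dict per table row)
    lhs, rhs: lists of column names; the dependency checked is lhs -> rhs
    An empty result means the functional dependency holds in `rows`.
    """
    seen = defaultdict(set)
    for row in rows:
        key = tuple(row[c] for c in lhs)
        seen[key].add(tuple(row[c] for c in rhs))
    return {key: values for key, values in seen.items() if len(values) > 1}


# Hypothetical toy data: in the real table, zip determines city.
real = [
    {"zip": "20095", "city": "Hamburg"},
    {"zip": "20095", "city": "Hamburg"},
    {"zip": "10115", "city": "Berlin"},
]

# A hypothetical synthetic table in which the generator broke the dependency.
synthetic = [
    {"zip": "20095", "city": "Hamburg"},
    {"zip": "20095", "city": "Berlin"},
    {"zip": "10115", "city": "Berlin"},
]

real_violations = fd_violations(real, ["zip"], ["city"])
synthetic_violations = fd_violations(synthetic, ["zip"], ["city"])

print("real:", real_violations)
print("synthetic:", synthetic_violations)
```

A check of this kind only covers exact functional dependencies on a single table. Column correlations and temporal dependencies, also listed in the abstract, would need separate, statistical comparisons.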
Taxonomy
Topics: Data Visualization and Analytics · Semantic Web and Ontologies · Big Data and Business Intelligence
Methods: Sparse Evolutionary Training
