Dependency-aware synthetic tabular data generation
Chaithra Umesh, Kristian Schultz, Manjunath Mahendra, Saptarshi Bej, Olaf Wolkenhauer

TL;DR
This paper introduces HFGF, a framework that improves the preservation of functional and logical dependencies in synthetic tabular data, enhancing data fidelity and utility in sensitive domains.
Contribution
The paper proposes HFGF, a hierarchical framework that reconstructs dependent features based on dependency rules, addressing the poor preservation of dependencies in existing models.
Findings
HFGF improves dependency preservation across multiple generative models.
HFGF enhances the structural fidelity of synthetic data.
Experiments show better downstream utility with HFGF.
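One simple way to make "dependency preservation" concrete is to count how many rows of a table break a functional dependency. The sketch below is illustrative only and is not the paper's evaluation metric; the helper `fd_violations` and the `zip` and `city` columns are hypothetical.

```python
from collections import defaultdict

def fd_violations(rows, determinant, dependent):
    """Count rows whose determinant value maps to more than one dependent value."""
    seen = defaultdict(set)
    for row in rows:
        seen[row[determinant]].add(row[dependent])
    # A determinant value violates the FD if it co-occurs with several dependent values.
    bad_keys = {key for key, values in seen.items() if len(values) > 1}
    return sum(1 for row in rows if row[determinant] in bad_keys)

# Hypothetical toy tables: in the real data, zip determines city.
real = [
    {"zip": "10115", "city": "Berlin"},
    {"zip": "10115", "city": "Berlin"},
    {"zip": "18055", "city": "Rostock"},
]
synthetic = [
    {"zip": "10115", "city": "Berlin"},
    {"zip": "10115", "city": "Rostock"},  # breaks zip -> city
    {"zip": "18055", "city": "Rostock"},
]

real_violations = fd_violations(real, "zip", "city")
synthetic_violations = fd_violations(synthetic, "zip", "city")
```

A synthetic table that preserves the dependency would report the same violation count as the real table, which is zero for an exact functional dependency.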
Abstract
Synthetic tabular data is increasingly used in privacy-sensitive domains such as health care, but existing generative models often fail to preserve inter-attribute relationships. In particular, functional dependencies (FDs) and logical dependencies (LDs), which capture deterministic and rule-based associations between features, are rarely retained in synthetic datasets, and when they are, often only poorly. To address this research gap, we propose the Hierarchical Feature Generation Framework (HFGF) for synthetic tabular data generation. The framework first generates independent features using any standard generative model, and then reconstructs dependent features based on predefined FD and LD rules. To evaluate HFGF, we created benchmark datasets with known dependencies. Our experiments on four benchmark datasets with varying sizes, feature imbalance, and dependency complexity demonstrate that…
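The two-stage idea described in the abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the stand-in "generative model" simply resamples values from the real data, the rule-learning helpers `learn_fd_map` and `learn_ld_map` are hypothetical, and so are the column names. The abstract does not say how dependent features are reconstructed, so the lookup-and-sample step here is an assumption.

```python
import random
from collections import defaultdict

def learn_fd_map(rows, determinant, dependent):
    """FD rule (assumed form): each determinant value maps to exactly one dependent value."""
    return {row[determinant]: row[dependent] for row in rows}

def learn_ld_map(rows, determinant, dependent):
    """LD rule (assumed form): each determinant value restricts the dependent to a set of allowed values."""
    allowed = defaultdict(set)
    for row in rows:
        allowed[row[determinant]].add(row[dependent])
    return allowed

def generate(rows, n, rng):
    fd = learn_fd_map(rows, "zip", "city")
    ld = learn_ld_map(rows, "sex", "pregnant")
    out = []
    for _ in range(n):
        # Stage 1: stand-in for any generative model; resample the independent features.
        rec = {"zip": rng.choice(rows)["zip"], "sex": rng.choice(rows)["sex"]}
        # Stage 2: reconstruct dependent features from the rules.
        rec["city"] = fd[rec["zip"]]
        rec["pregnant"] = rng.choice(sorted(ld[rec["sex"]]))
        out.append(rec)
    return out

# Hypothetical toy data: zip -> city is an FD; sex restricts pregnant (an LD).
real = [
    {"zip": "10115", "city": "Berlin", "sex": "F", "pregnant": "yes"},
    {"zip": "10115", "city": "Berlin", "sex": "F", "pregnant": "no"},
    {"zip": "18055", "city": "Rostock", "sex": "M", "pregnant": "no"},
]
synthetic = generate(real, 50, random.Random(0))
```

Because the dependent columns are filled in from the rules rather than sampled freely, every synthetic row satisfies both dependencies by construction, whatever the first stage produces.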
Taxonomy
Topics: Privacy-Preserving Technologies in Data · Machine Learning in Healthcare · Data Quality and Management
