TL;DR
This paper presents a comprehensive framework for tabular data generative modeling, introducing a new loss function, a multi-objective Bayesian optimization strategy, and extensive benchmarking to improve data fidelity and hyperparameter tuning.
Contribution
The paper introduces a novel correlation- and distribution-aware loss function, a multi-objective Bayesian optimization method, and a unified benchmarking framework for tabular generative models.
Findings
The correlation-aware loss improves synthetic data quality and downstream ML performance.
IORBO, the proposed multi-objective Bayesian optimization method, outperforms standard Bayesian optimization in hyperparameter tuning.
The framework achieves better data fidelity and hyperparameter optimization across 20 datasets.
Abstract
Deep learning (DL) models require extensive data to achieve strong performance and generalization. Deep generative models (DGMs) offer a solution by synthesizing data. Yet current approaches for tabular data often fail to preserve feature correlations and distributions during training, struggle with multi-metric hyperparameter selection, and lack comprehensive evaluation protocols. We address this gap with a unified framework that integrates training, hyperparameter tuning, and evaluation. First, we introduce a novel correlation- and distribution-aware loss function that regularizes DGMs, enhancing their ability to generate synthetic tabular data that faithfully represents the underlying data distributions. Theoretical analysis establishes stability and consistency guarantees. To enable principled hyperparameter search via Bayesian optimization (BO), we also propose a new…
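To make the idea of a correlation- and distribution-aware regularizer concrete, here is a minimal NumPy sketch. It is not the paper's loss, whose exact form is not given in this summary. It assumes a simple penalty: the mean squared gap between the Pearson correlation matrices of a real and a synthetic batch, plus a moment-matching term on per-feature means and standard deviations. The function names and the weights `w_corr` and `w_dist` are hypothetical.

```python
import numpy as np


def correlation_matrix(x, eps=1e-8):
    """Pearson correlation matrix of an (n_samples, n_features) array."""
    centered = x - x.mean(axis=0, keepdims=True)
    # eps guards against division by zero for constant features.
    z = centered / (centered.std(axis=0, keepdims=True) + eps)
    return (z.T @ z) / x.shape[0]


def correlation_distribution_penalty(real, synth, w_corr=1.0, w_dist=1.0):
    """Illustrative regularizer (an assumption, not the paper's formula).

    corr_term: mean squared difference between the correlation matrices.
    dist_term: mean squared difference of per-feature means and stds.
    """
    corr_gap = correlation_matrix(real) - correlation_matrix(synth)
    corr_term = np.mean(corr_gap ** 2)

    mean_gap = real.mean(axis=0) - synth.mean(axis=0)
    std_gap = real.std(axis=0) - synth.std(axis=0)
    dist_term = np.mean(mean_gap ** 2) + np.mean(std_gap ** 2)

    return w_corr * corr_term + w_dist * dist_term


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    cov = [[1.0, 0.9], [0.9, 1.0]]
    # "Real" data with two strongly correlated features.
    real = rng.multivariate_normal([0.0, 0.0], cov, size=2000)
    # A faithful synthetic batch drawn from the same distribution.
    faithful = rng.multivariate_normal([0.0, 0.0], cov, size=2000)
    # A poor synthetic batch that ignores the correlation.
    independent = rng.normal(size=(2000, 2))

    print("faithful:   ", correlation_distribution_penalty(real, faithful))
    print("independent:", correlation_distribution_penalty(real, independent))
```

In a training loop, a term like this would be added to the generator's usual objective, so that batches whose feature correlations or marginal statistics drift from the real data are penalized. The batch that ignores the correlation should receive a clearly larger penalty than the faithful one.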
