A supervised generative optimization approach for tabular data

Shinpei Nakamura-Sakai; Fadi Hamad; Saheed Obitayo; Vamsi K. Potluru

arXiv:2309.05079 · cs.LG · May 13, 2024

Open Access

TL;DR

This paper introduces a supervised generative optimization framework for creating synthetic tabular data, incorporating downstream task information and meta-learning to improve data utility and relevance.

Contribution

It proposes a novel framework that combines supervised learning and meta-learning to optimize synthetic data generation for specific downstream tasks.

Findings

01. Enhanced synthetic data relevance for downstream tasks

02. Meta-learning improves distribution mixture selection

03. Framework outperforms unsupervised methods in utility

Abstract

Synthetic data generation has emerged as a crucial topic for financial institutions, driven by multiple factors such as privacy protection and data augmentation. Many algorithms have been proposed for synthetic data generation, but reaching a consensus on which method to use for a specific data set and use case remains challenging. Moreover, the majority of existing approaches are "unsupervised" in the sense that they do not take the downstream task into account. To address these issues, this work presents a novel synthetic data generation framework. The framework integrates a supervised component tailored to the specific downstream task and employs a meta-learning approach to learn the optimal mixture distribution of existing synthetic distributions.
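To make the idea of a task-aware mixture of synthetic distributions concrete, the following is a minimal sketch and not the authors' implementation. It assumes each existing generator has already produced a labeled synthetic table, and it substitutes plain random search over Dirichlet-sampled weights for whatever meta-learning procedure the paper actually uses. The helper names (`sample_mixture`, `fit_centroids`, `accuracy`, `search_mixture`), the nearest-centroid downstream model, and the toy data are all hypothetical choices made for illustration.

```python
import numpy as np


def sample_mixture(sources, weights, n, rng):
    """Draw n rows from several synthetic (X, y) sets according to mixture weights."""
    counts = rng.multinomial(n, weights)  # rows to take from each source
    Xs, ys = [], []
    for (X, y), k in zip(sources, counts):
        idx = rng.integers(0, len(X), size=k)  # sample with replacement
        Xs.append(X[idx])
        ys.append(y[idx])
    return np.concatenate(Xs), np.concatenate(ys)


def fit_centroids(X, y):
    """Stand-in downstream model (an assumption): one centroid per class."""
    classes = np.unique(y)
    centroids = np.stack([X[y == c].mean(axis=0) for c in classes])
    return classes, centroids


def accuracy(model, X, y):
    """Accuracy of the nearest-centroid model on held-out real data."""
    classes, centroids = model
    dists = ((X[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    return float((classes[dists.argmin(axis=1)] == y).mean())


def search_mixture(sources, X_val, y_val, n_trials=50, n_rows=500, seed=0):
    """Random search over the weight simplex, scored on the downstream task."""
    rng = np.random.default_rng(seed)
    best_w, best_score = None, -np.inf
    for _ in range(n_trials):
        w = rng.dirichlet(np.ones(len(sources)))  # candidate mixture weights
        X_syn, y_syn = sample_mixture(sources, w, n_rows, rng)
        score = accuracy(fit_centroids(X_syn, y_syn), X_val, y_val)
        if score > best_score:
            best_w, best_score = w, score
    return best_w, best_score


# Toy setup (entirely made up): two classes separated along both features.
rng = np.random.default_rng(42)


def make_toy(n):
    y = rng.integers(0, 2, size=n)
    X = rng.normal(size=(n, 2)) + np.where(y[:, None] == 1, 1.0, -1.0)
    return X, y


X_val, y_val = make_toy(300)         # held-out "real" validation data
good = make_toy(400)                 # a generator that preserves label structure
X_bad, y_bad = make_toy(400)
bad = (X_bad, rng.permutation(y_bad))  # a generator with scrambled labels

best_w, best_score = search_mixture([good, bad], X_val, y_val)
print("weights:", best_w, "validation accuracy:", best_score)
```

The point of the sketch is the objective: mixture weights are chosen by how well a model trained on the mixed synthetic data performs on the downstream task, rather than by a task-agnostic similarity measure. A more sample-efficient optimizer could replace the random search without changing that structure.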

Taxonomy

Topics: Bayesian Methods and Mixture Models · Data Management and Algorithms