TL;DR
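This paper introduces a cascaded generative model for mixed-type tabular data. A low-resolution stage first generates the purely categorical features together with a coarse categorical version of each numerical feature; a high-resolution flow matching model then fills in the numerical detail using a guided conditional probability path and data-dependent coupling.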
Contribution
The paper proposes a cascaded flow matching approach for generative modeling of mixed-type tabular data, supported by formal proofs and by experimental results that improve on prior methods.
Findings
Generated samples are more realistic and detailed.
Detection score improves by 51.9%.
The model better captures distributional nuances, such as missing or inflated values within numerical features.
Abstract
Advances in generative modeling have recently been adapted to tabular data containing discrete and continuous features. However, generating mixed-type features that combine discrete states with an otherwise continuous distribution in a single feature remains challenging. We advance the state-of-the-art in diffusion models for tabular data with a cascaded approach. We first generate a low-resolution version of a tabular data row, that is, the collection of the purely categorical features and a coarse categorical representation of numerical features. Next, this information is leveraged in the high-resolution flow matching model via a novel guided conditional probability path and data-dependent coupling. The low-resolution representation of numerical features explicitly accounts for discrete outcomes, such as missing or inflated values, and therewith enables a more faithful generation of…
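To make the "coarse categorical representation of numerical features" concrete, here is a minimal sketch of one way such an encoding could look. It is an illustration under stated assumptions, not the paper's actual procedure: the function name `coarse_encode`, the code layout (0 for missing, then one code per inflated value, then quantile bins), and the choice of quantile binning are all hypothetical.

```python
import numpy as np

def coarse_encode(x, inflated_values=(0.0,), n_bins=4):
    """Map a numerical column to coarse categorical codes (hypothetical scheme).

    Codes: 0 = missing (NaN), 1..k = the k inflated values in the given order,
    k+1.. = quantile bins of the remaining, regular values.
    """
    x = np.asarray(x, dtype=float)
    codes = np.empty(x.shape, dtype=int)

    # Missing values get their own discrete state.
    special = np.isnan(x)
    codes[special] = 0

    # Each inflated value (e.g. a spike at exactly 0) gets its own state.
    for i, v in enumerate(inflated_values, start=1):
        hit = (x == v) & ~special
        codes[hit] = i
        special |= hit

    # Everything else is binned by quantiles of the regular values only.
    rest = ~special
    offset = 1 + len(inflated_values)
    if rest.any():
        edges = np.quantile(x[rest], np.linspace(0, 1, n_bins + 1)[1:-1])
        codes[rest] = offset + np.searchsorted(edges, x[rest], side="right")
    return codes

column = [np.nan, 0.0, 0.0, 1.5, 2.5, 3.5, 10.0]
print(coarse_encode(column, inflated_values=(0.0,), n_bins=2))
```

In a cascade of this kind, the first stage would generate these codes alongside the purely categorical features, and the second stage would only need to produce a continuous value for rows whose code falls in a regular bin; rows coded as missing or inflated already have their value determined.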
