TL;DR
This paper introduces a novel diffusion model with transformer-based conditioning and dynamic masking for effective tabular data imputation and synthetic data generation, outperforming existing methods in efficiency, similarity, and privacy.
Contribution
It presents a new diffusion model architecture with attention, transformer, and masking enhancements tailored for tabular data tasks, unifying imputation and generation.
Findings
Outperforms state-of-the-art models like VAE, GAN, and existing diffusion models.
Effectively handles missing data imputation with high efficiency.
Generates statistically similar data with reduced privacy risks.
Abstract
Data imputation and data generation have important applications in many domains, such as healthcare and finance, where incomplete or missing data can hinder accurate analysis and decision-making. Diffusion models have emerged as powerful generative models capable of capturing complex data distributions across various data modalities such as image, audio, and time series data. Recently, they have also been adapted to generate tabular data. In this paper, we propose a diffusion model for tabular data that introduces three key enhancements: (1) a conditioning attention mechanism, (2) an encoder-decoder transformer as the denoising network, and (3) dynamic masking. The conditioning attention mechanism is designed to improve the model's ability to capture the relationship between the condition and synthetic data. The transformer layers help model interactions within the condition (encoder) or…
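The abstract is cut off before it describes dynamic masking, so the following is only a minimal sketch of one plausible reading: for each training row, a random subset of features is marked as "to be generated" and the rest serve as the condition. The function names `sample_dynamic_mask` and `split_row`, the uniform choice of how many features to mask, and the example row are all assumptions made for illustration, not the authors' implementation.

```python
import random

def sample_dynamic_mask(num_features, rng):
    """Return a 0/1 list: 1 = feature to generate, 0 = feature given as condition.

    Assumption: the number of masked features is drawn uniformly per row.
    """
    # Mask at least one feature so there is always something to denoise.
    num_masked = rng.randint(1, num_features)
    masked_idx = set(rng.sample(range(num_features), num_masked))
    return [1 if j in masked_idx else 0 for j in range(num_features)]

def split_row(row, mask):
    """Split a row into conditioning features (mask 0) and target features (mask 1)."""
    condition = {j: v for j, (v, m) in enumerate(zip(row, mask)) if m == 0}
    target = {j: v for j, (v, m) in enumerate(zip(row, mask)) if m == 1}
    return condition, target

rng = random.Random(0)            # fixed seed so the sketch is repeatable
row = [0.3, 1.7, 42.0, 0.0, 5.5]  # hypothetical row with five features
mask = sample_dynamic_mask(len(row), rng)
condition, target = split_row(row, mask)
```

Under this reading, a mask of all ones corresponds to unconditional generation of a full row, while a partial mask corresponds to imputing the masked features from the observed ones, which is how a single model could serve both tasks.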
Peer Reviews
Decision·Submitted to ICLR 2024
Overall: The paper is easy to read and the contribution is simple but effective. The experiments cover a wide range of datasets, though not of algorithms. Pros: (i) The paper extends TabDDPM to TabGenDDPM using the transformer architecture, which has been wildly successful in other generative settings. The experiments confirm the benefits of the proposed approach. The additional benefit of covering both imputation and generation in the same framework enables a wide range of use cases in real-worl
Cons: (a) Some competing methods, such as AIM and CTAB-GAN+, are not compared in the paper. (b) The datasets have few features. HELOC has the most, with only 21 features, and it is unclear how this framework performs when the feature set is large.
The experimental comparisons are good. The authors evaluate TabGenDDPM on eight datasets under three evaluation criteria.
1. The overall contribution of this paper is limited. All of the content except the transformer conditioning architecture is already known. The architecture design is heuristic, with no theoretical guarantees of performance. Moreover, the authors build upon the Variance Preserving (VP) SDE (e.g., DDPM, or TabDDPM for tabular data). They do not mention whether their method works for the Variance Exploding (VE) SDE (e.g., score-based generative models, or StaSy [1] for tabular data). [1]: Kim, J., Lee,
- The proposed architecture is a natural improvement over TabDDPM and, according to the experiments, it seems to really improve the model in terms of ML efficacy
- The paper is clear and well written, with several illustrations
- The privacy risk is considered
- The proposed architecture is mostly a derivative work of TabDDPM
- The proposed diffusion algorithms are a bit outdated now, especially on the discrete side, given works such as Austin et al., "Structured Denoising Diffusion Models in Discrete State-Spaces", NeurIPS 2021, and Campbell et al., "A Continuous Time Framework for Discrete Denoising Models", NeurIPS 2022. It is worth noting that "mask" systems are also studied in (Austin et al. 2021).
- No ablation study to validate the separately differe
Taxonomy
Methods: Softmax · Attention Is All You Need · Diffusion
