DP-TBART: A Transformer-based Autoregressive Model for Differentially Private Tabular Data Generation
Rodrigo Castellon, Achintya Gopal, Brian Bloniarz, David Rosenberg

TL;DR
DP-TBART is a transformer-based autoregressive model that generates differentially private synthetic tabular data. It outperforms traditional marginal-based methods in certain settings, and the paper offers a theoretical account of those methods' limitations.
Contribution
Introduces DP-TBART, a novel deep learning-based approach for differentially private tabular data generation, together with a theoretical framework for understanding the limitations of marginal-based methods and where deep learning-based approaches stand to contribute most.
Findings
DP-TBART achieves performance competitive with marginal-based methods on a wide variety of datasets.
In some settings, DP-TBART outperforms state-of-the-art approaches.
Provides a theoretical analysis of the limitations of marginal-based methods.
Abstract
The generation of synthetic tabular data that preserves differential privacy is a problem of growing importance. While traditional marginal-based methods have achieved impressive results, recent work has shown that deep learning-based approaches tend to lag behind. In this work, we present Differentially-Private TaBular AutoRegressive Transformer (DP-TBART), a transformer-based autoregressive model that maintains differential privacy and achieves performance competitive with marginal-based methods on a wide variety of datasets, capable of even outperforming state-of-the-art methods in certain settings. We also provide a theoretical framework for understanding the limitations of marginal-based approaches and where deep learning-based approaches stand to contribute most. These results suggest that deep learning-based techniques should be considered as a viable alternative to…
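To make the phrase "autoregressive model" concrete, here is a minimal sketch of how such a model generates a table row: each column is sampled conditioned on the columns already generated, so the row probability factorizes as a product of per-column conditionals. Everything in it is an assumption for illustration: the schema, the function names, and especially `toy_conditional`, which stands in for a trained transformer. It does not reproduce the paper's architecture and includes no privacy mechanism.

```python
import math
import random

# Hypothetical discretized schema: column name -> number of categories.
SCHEMA = {"age_bucket": 4, "education": 3, "income_bucket": 2}


def toy_conditional(prefix, n_categories):
    """Stand-in for a trained model's p(x_i | x_1, ..., x_{i-1}).

    A real model would run a transformer over the prefix tokens. Here the
    weights simply depend on the prefix so the conditioning is visible.
    """
    shift = sum(prefix) % n_categories
    weights = [2.0 if k == shift else 1.0 for k in range(n_categories)]
    total = sum(weights)
    return [w / total for w in weights]


def sample_row(rng):
    """Sample one synthetic row column by column, left to right."""
    prefix = []
    for n_categories in SCHEMA.values():
        probs = toy_conditional(prefix, n_categories)
        value = rng.choices(range(n_categories), weights=probs, k=1)[0]
        prefix.append(value)
    return dict(zip(SCHEMA, prefix))


def row_log_prob(row):
    """Autoregressive factorization: log p(x) = sum_i log p(x_i | x_<i)."""
    prefix = []
    log_p = 0.0
    for name, n_categories in SCHEMA.items():
        probs = toy_conditional(prefix, n_categories)
        log_p += math.log(probs[row[name]])
        prefix.append(row[name])
    return log_p


if __name__ == "__main__":
    rng = random.Random(0)
    row = sample_row(rng)
    print(row, row_log_prob(row))
```

Because every conditional is a proper distribution, the probabilities of all possible rows sum to one, which is what lets the same model both score rows and sample new ones.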
Taxonomy
Topics: Privacy-Preserving Technologies in Data · Vehicular Ad Hoc Networks (VANETs)
Methods: Multi-Head Attention · Attention Is All You Need · Linear Layer · Label Smoothing · Softmax · Dense Connections · Dropout · Byte Pair Encoding · Position-Wise Feed-Forward Layer · Residual Connection
