dpmm: Differentially Private Marginal Models, a Library for Synthetic Tabular Data Generation
Sofiane Mahiou, Amir Dizche, Reza Nazari, Xinmin Wu, Ralph Abbey, Jorge Silva, Georgi Ganev

TL;DR
dpmm is an open-source library that generates synthetic tabular data with differential privacy guarantees, integrating popular marginal models and best practices for robustness and utility.
Contribution
The paper introduces dpmm, a customizable open-source library that combines three marginal models (PrivBayes, MST, and AIM) with end-to-end differential privacy guarantees for synthetic tabular data generation.
Findings
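The included models achieve superior utility and offer richer functionality compared to alternative implementations
The library provides robust, highly customizable, easy-to-install implementations of PrivBayes, MST, and AIM
It adopts best practices to provide end-to-end DP guarantees and address well-known DP-related vulnerabilities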
Abstract
We propose dpmm, an open-source library for synthetic data generation with Differentially Private (DP) guarantees. It includes three popular marginal models -- PrivBayes, MST, and AIM -- that achieve superior utility and offer richer functionality compared to alternative implementations. Additionally, we adopt best practices to provide end-to-end DP guarantees and address well-known DP-related vulnerabilities. Our goal is to accommodate a wide audience with easy-to-install, highly customizable, and robust model implementations. Our codebase is available from https://github.com/sassoftware/dpmm.
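To make the term "marginal model" concrete, the following is a minimal sketch of the basic building block such models rely on: measuring a one-way marginal (a histogram over one column) with calibrated noise, then sampling synthetic values from it. This is not the dpmm API and not the authors' implementation. The function names `noisy_marginal` and `sample_synthetic` are hypothetical, the Laplace mechanism is used here for simplicity, and the sketch assumes add/remove-one-record neighbouring datasets, so the histogram has L1 sensitivity 1. The column's domain is passed in explicitly rather than read from the data, since deriving it from the data would not be covered by the DP guarantee.

```python
import random
from collections import Counter


def noisy_marginal(values, domain, epsilon, rng):
    """Return an epsilon-DP histogram of `values` over a fixed `domain`.

    Hypothetical helper for illustration only (not part of dpmm).
    Assumes add/remove-one neighbouring datasets: L1 sensitivity is 1.
    """
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    counts = Counter(values)
    scale = 1.0 / epsilon  # Laplace scale = sensitivity / epsilon
    noisy = {}
    for category in domain:
        # The difference of two i.i.d. exponentials with rate 1/scale
        # is a Laplace(0, scale) sample.
        noise = rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)
        noisy[category] = counts[category] + noise
    return noisy


def sample_synthetic(noisy, n, rng):
    """Sample n synthetic values from a noisy marginal (post-processing)."""
    categories = list(noisy)
    # Clip negative noisy counts to zero before normalising.
    weights = [max(noisy[c], 0.0) for c in categories]
    if sum(weights) == 0:
        weights = [1.0] * len(categories)  # fall back to uniform
    return rng.choices(categories, weights=weights, k=n)


rng = random.Random(0)
column = ["a"] * 50 + ["b"] * 30
marginal = noisy_marginal(column, domain=["a", "b", "c"], epsilon=1.0, rng=rng)
synthetic = sample_synthetic(marginal, n=20, rng=rng)
```

Models such as PrivBayes, MST, and AIM go well beyond this sketch: they select and measure multiple low-dimensional marginals and fit a joint model to them, which a single-column histogram does not capture.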
Taxonomy
Topics: Privacy-Preserving Technologies in Data · 3D Modeling in Geospatial Applications
Methods: ADaptive gradient method with the OPTimal convergence rate · Lib
