HeteroFedSyn: Differentially Private Tabular Data Synthesis for Heterogeneous Federated Settings
Xiaochen Li, Fengyu Gao, Xizixiang Wei, Tianhao Wang, Cong Shen, Jing Yang

TL;DR
HeteroFedSyn is a differentially private (DP) data synthesis framework for horizontal federated settings in which data distributions differ across participants. It lets participants share high-utility synthetic tabular data under DP guarantees.
Contribution
It presents the first DP tabular data synthesis framework designed for the horizontal federated setting, built on new techniques for distributed marginal selection.
Findings
HeteroFedSyn achieves utility comparable to centralized synthesis despite the noise added in the federated setting.
It is effective on range queries, Wasserstein fidelity, and machine learning tasks.
It introduces new dependency metrics and adaptive strategies for distributed data synthesis.
Abstract
Traditional Differential Privacy (DP) mechanisms are typically tailored to specific analysis tasks, which limits the reusability of protected data. DP tabular data synthesis overcomes this by generating synthetic datasets that can be shared for arbitrary downstream tasks. However, existing synthesis methods predominantly assume centralized or local settings and overlook the more practical horizontal federated scenario. Naively synthesizing data locally or perturbing individual records either produces biased mixtures or introduces excessive noise, especially under heterogeneous data distributions across participants. We propose HeteroFedSyn, the first DP tabular data synthesis framework designed specifically for the horizontal federated setting. Built upon the PrivSyn paradigm of 2-way marginal-based synthesis, HeteroFedSyn introduces three key innovations for distributed marginal…
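To make the idea of a noisy 2-way marginal in a horizontal federated setting concrete, here is a minimal sketch using only the Python standard library. It is an illustration under stated assumptions, not HeteroFedSyn's actual protocol: the function names (local_marginal, noisy_federated_marginal), the choice of per-client Gaussian noise, and the clip-and-normalize post-processing are all hypothetical, and calibrating sigma to a concrete privacy budget is deliberately left out.

```python
import random

def local_marginal(records, dom_a, dom_b):
    """Count table for one attribute pair (A, B) on a single client's records."""
    table = [[0.0] * dom_b for _ in range(dom_a)]
    for a, b in records:
        table[a][b] += 1.0
    return table

def noisy_federated_marginal(client_records, dom_a, dom_b, sigma, rng):
    """Sum per-client count tables, each perturbed with Gaussian noise.

    Assumption: every client adds independent N(0, sigma^2) noise to each
    cell before sending its table. How sigma maps to a privacy budget is
    outside the scope of this sketch.
    """
    total = [[0.0] * dom_b for _ in range(dom_a)]
    for records in client_records:
        counts = local_marginal(records, dom_a, dom_b)
        for i in range(dom_a):
            for j in range(dom_b):
                total[i][j] += counts[i][j] + rng.gauss(0.0, sigma)

    # Post-processing (keeps the DP guarantee): clip negatives, then normalize.
    clipped = [[max(v, 0.0) for v in row] for row in total]
    mass = sum(sum(row) for row in clipped)
    if mass == 0.0:
        uniform = 1.0 / (dom_a * dom_b)
        return [[uniform] * dom_b for _ in range(dom_a)]
    return [[v / mass for v in row] for row in clipped]

# Two clients with heterogeneous data: each mostly sees a different cell.
clients = [
    [(0, 0)] * 3 + [(0, 1)],
    [(1, 1)] * 3 + [(1, 0)],
]
exact = noisy_federated_marginal(clients, 2, 2, 0.0, random.Random(0))
noisy = noisy_federated_marginal(clients, 2, 2, 1.0, random.Random(0))
```

With sigma set to 0 the result is the exact pooled marginal, which neither client could recover from its own skewed data. With sigma above 0, the variance of the summed noise grows with the number of clients, which is one reason naive per-client perturbation can hurt utility.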
Taxonomy
Topics: Privacy-Preserving Technologies in Data · Stochastic Gradient Optimization Techniques · Mobile Crowdsensing and Crowdsourcing
