S-BDT: Distributed Differentially Private Boosted Decision Trees

Thorsten Peinemann; Moritz Kirschte; Joshua Stock; Carlos Cotrini; Esfandiar Mohammadi

arXiv:2309.12041 · cs.CR · August 19, 2024

TL;DR

S-BDT is a differentially private distributed GBDT learner that adds less noise and so improves the privacy-utility trade-off, with further gains for non-IID data streams. It is evaluated on the Abalone, Adult, and Spambase datasets.

Contribution

It proposes a novel $(ε, δ)$-differentially private GBDT learner that adds less noise by using non-spherical multivariate Gaussian noise, and it provides tight subsampling bounds for privacy amplification along with practical utility improvements.
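
For reference, the standard definition behind the $(ε, δ)$ notation can be written as follows. The symbols $M$ (the randomized learner), $D$ and $D'$ (two datasets differing in one training data point), and $S$ (any set of possible outputs) are notation assumed here for illustration, not taken from this page.

$$
\Pr[M(D) \in S] \;\leq\; e^{ε} \cdot \Pr[M(D') \in S] + δ
$$

A smaller $ε$ means the output distribution changes less when a single data point changes, which is why the findings are reported as savings in epsilon at equal utility.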

Findings

01

Achieves the same utility with 50% less epsilon for $ε \leq 0.5$ on the Abalone regression dataset.

02

Saves 30% in epsilon on the Adult dataset (for $ε \leq 0.08$) and on the Spambase dataset (for $ε \leq 0.03$).

03

Further improves privacy savings for non-IID data streams.

Abstract

We introduce S-BDT: a novel $(ε, δ)$-differentially private distributed gradient boosted decision tree (GBDT) learner that improves the protection of single training data points (privacy) while achieving meaningful learning goals, such as accuracy or regression error (utility). S-BDT uses less noise by relying on non-spherical multivariate Gaussian noise, for which we show tight subsampling bounds for privacy amplification and incorporate that into a Rényi filter for individual privacy accounting. We experimentally reach the same utility while saving $50\%$ in terms of epsilon for $ε \leq 0.5$ on the Abalone regression dataset (dataset size $\approx 4K$), saving $30\%$ in terms of epsilon for $ε \leq 0.08$ for the Adult classification dataset (dataset size $\approx 50K$), and saving $30\%$ in terms of epsilon for $ε \leq 0.03$ for the Spambase…
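
To make the phrase "non-spherical multivariate Gaussian noise" concrete, the minimal sketch below contrasts it with spherical noise. It is not the S-BDT algorithm and is not taken from the official repository: the function names, the per-coordinate standard deviations, and the example numbers are all hypothetical. It relies only on the standard closed form for the Rényi divergence of order α between two Gaussians that share a diagonal covariance and whose means differ by a shift vector, namely (α/2) · Σ_i (shift_i / σ_i)².

```python
import random


def add_gaussian_noise(values, sigmas, rng):
    """Add independent zero-mean Gaussian noise; sigmas[i] is the std of coordinate i.

    Equal sigmas give spherical noise; unequal sigmas give non-spherical noise.
    """
    return [v + rng.gauss(0.0, s) for v, s in zip(values, sigmas)]


def renyi_divergence(shift, sigmas, alpha):
    """Renyi divergence of order alpha between N(mu, diag(sigmas^2)) and
    N(mu + shift, diag(sigmas^2)): (alpha / 2) * sum((shift_i / sigma_i)^2)."""
    return 0.5 * alpha * sum((d / s) ** 2 for d, s in zip(shift, sigmas))


# Hypothetical worst-case change of a 2-dimensional statistic when one
# data point changes: the second coordinate is far less sensitive.
shift = [1.0, 0.1]

spherical = [2.0, 2.0]       # same noise scale in every coordinate
non_spherical = [2.0, 0.5]   # less noise on the low-sensitivity coordinate

rng = random.Random(0)       # seeded only to make the sketch repeatable
statistic = [10.0, 3.0]      # made-up values, for illustration only

print("spherical release:    ", add_gaussian_noise(statistic, spherical, rng))
print("non-spherical release:", add_gaussian_noise(statistic, non_spherical, rng))
print("divergence (alpha=2), spherical:    ", renyi_divergence(shift, spherical, 2.0))
print("divergence (alpha=2), non-spherical:", renyi_divergence(shift, non_spherical, 2.0))
```

In this toy setting, shrinking the noise on the low-sensitivity coordinate raises the divergence only modestly, because that coordinate contributes little to the sum. How S-BDT actually shapes its noise, and how the subsampling bounds and the Rényi filter enter the accounting, is specified in the paper rather than here.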

Code & Models

Repositories

kirschte/sbdt (Official)

Taxonomy

Topics: Privacy-Preserving Technologies in Data