S-BDT: Distributed Differentially Private Boosted Decision Trees
Thorsten Peinemann, Moritz Kirschte, Joshua Stock, Carlos Cotrini, Esfandiar Mohammadi

TL;DR
S-BDT introduces a differentially private distributed GBDT method that reduces noise and improves privacy-utility trade-offs, especially for non-IID data streams, demonstrated on multiple datasets.
Contribution
It proposes a novel $(\epsilon,\delta)$-differentially private GBDT learner that adds less noise by using non-spherical Gaussian noise, and provides tight privacy bounds and practical utility improvements.
Findings
Achieves the same utility with 50% less epsilon on the Abalone dataset.
Saves 30% epsilon on the Adult and Spambase datasets.
Further improves privacy savings for non-IID data streams.
Abstract
We introduce S-BDT: a novel $(\epsilon,\delta)$-differentially private distributed gradient boosted decision tree (GBDT) learner that improves the protection of single training data points (privacy) while achieving meaningful learning goals, such as accuracy or regression error (utility). S-BDT uses less noise by relying on non-spherical multivariate Gaussian noise, for which we show tight subsampling bounds for privacy amplification and incorporate that into a Rényi filter for individual privacy accounting. We experimentally reach the same utility while saving 50% in terms of epsilon on the Abalone regression dataset, saving 30% in terms of epsilon for the Adult classification dataset, and saving 30% in terms of epsilon for the Spambase…
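The idea of non-spherical Gaussian noise can be illustrated with a minimal sketch. It is not the paper's implementation: the function names, the choice of a two-component query, and the example noise scales are all assumptions made for illustration. The sketch adds independent Gaussian noise with a separate standard deviation per coordinate and computes the standard Rényi divergence bound of order $\alpha$ for such a mechanism, $\frac{\alpha}{2}\sum_i (\Delta_i/\sigma_i)^2$, where $\Delta_i$ is the sensitivity of coordinate $i$. It does not cover subsampling amplification or the Rényi filter.

```python
import random


def add_nonspherical_gaussian_noise(values, sigmas, rng=None):
    """Add independent Gaussian noise with a separate scale per coordinate.

    `values` and `sigmas` are hypothetical names: a query vector and the
    per-coordinate noise standard deviations (diagonal covariance).
    """
    if len(values) != len(sigmas):
        raise ValueError("values and sigmas must have the same length")
    rng = rng or random.Random()
    return [v + rng.gauss(0.0, s) for v, s in zip(values, sigmas)]


def gaussian_rdp(alpha, coord_sensitivities, sigmas):
    """Renyi divergence bound of order alpha for diagonal Gaussian noise.

    Uses the closed form (alpha / 2) * sum_i (Delta_i / sigma_i)^2, which
    reduces to alpha * ||Delta||^2 / (2 * sigma^2) when all sigmas are equal.
    """
    if alpha <= 1:
        raise ValueError("alpha must be greater than 1")
    if len(coord_sensitivities) != len(sigmas):
        raise ValueError("sensitivities and sigmas must have the same length")
    return 0.5 * alpha * sum(
        (d / s) ** 2 for d, s in zip(coord_sensitivities, sigmas)
    )


# Assumed example: a two-component query where each coordinate has
# sensitivity 1, released once with spherical and once with
# non-spherical noise.
spherical_cost = gaussian_rdp(2.0, [1.0, 1.0], [1.0, 1.0])
nonspherical_cost = gaussian_rdp(2.0, [1.0, 1.0], [2.0, 1.0])
noisy = add_nonspherical_gaussian_noise([10.0, 3.0], [2.0, 1.0], random.Random(0))
```

Choosing the scales per coordinate lets a mechanism spend more noise where it hurts utility least, at a privacy cost that the formula above makes explicit.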
Taxonomy
Topics: Privacy-Preserving Technologies in Data
