High Dimensional Differentially Private Stochastic Optimization with Heavy-tailed Data
Lijie Hu, Shuo Ni, Hanshen Xiao, Di Wang

TL;DR
This paper investigates high-dimensional differentially private stochastic convex optimization with heavy-tailed data, providing new error bounds and algorithms for both regular and sparse learning scenarios under privacy constraints.
Contribution
It presents the first study of differentially private stochastic convex optimization (DP-SCO) with heavy-tailed data in high dimensions, deriving new error bounds and proposing algorithms for both polytope-constrained and sparse models.
Findings
Error bound of $O\left(\frac{\log d}{(n\epsilon)^{1/3}}\right)$ for smooth loss functions.
Improved error bound of $O\left(\frac{\log d}{(n\epsilon)^{2/5}}\right)$ for LASSO with bounded fourth moments.
Proposed truncated DP-IHT method achieves error of $O\left(\frac{s^{*2}\log d}{n\epsilon}\right)$.
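To get a feel for how the three rates listed above scale, the following minimal sketch evaluates them numerically. It assumes the bounds read log d / (n·eps)^(1/3), log d / (n·eps)^(2/5), and s^2 · log d / (n·eps), and it ignores constants and any hidden logarithmic factors, so only the trends are meaningful. The function names and the sample values of n, d, eps, and s are illustrative choices, not taken from the paper.

```python
import math


def rate_smooth(n: int, d: int, eps: float) -> float:
    """log d / (n * eps)^(1/3): stated rate for smooth losses (constants dropped)."""
    return math.log(d) / (n * eps) ** (1.0 / 3.0)


def rate_lasso(n: int, d: int, eps: float) -> float:
    """log d / (n * eps)^(2/5): stated rate for LASSO with bounded fourth moments."""
    return math.log(d) / (n * eps) ** (2.0 / 5.0)


def rate_sparse(n: int, d: int, eps: float, s: int) -> float:
    """s^2 * log d / (n * eps): stated rate for truncated DP-IHT (sparsity s)."""
    return s ** 2 * math.log(d) / (n * eps)


if __name__ == "__main__":
    d, eps, s = 10_000, 1.0, 5  # hypothetical dimension, privacy budget, sparsity
    for n in (10**4, 10**6, 10**8):
        print(
            f"n={n:>9}: smooth={rate_smooth(n, d, eps):.4f}  "
            f"lasso={rate_lasso(n, d, eps):.4f}  "
            f"sparse={rate_sparse(n, d, eps, s):.6f}"
        )
```

All three expressions depend on the dimension only through log d, which is what makes them usable when d is much larger than n. The sparse-model rate applies to a different setting than the first two, so it should not be read as a direct improvement over them.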
Abstract
As one of the most fundamental problems in machine learning, statistics and differential privacy, Differentially Private Stochastic Convex Optimization (DP-SCO) has been extensively studied in recent years. However, most previous work can handle only regular data distributions, or irregular data in the low-dimensional case. To better understand the challenges arising from irregular data distributions, in this paper we provide the first study of DP-SCO with heavy-tailed data in the high-dimensional setting. In the first part we focus on the problem over a polytope constraint (such as the $\ell_1$-norm ball). We show that if the loss function is smooth and its gradient has bounded second-order moment, it is possible to get a (high-probability) error bound (excess population risk) of $O\left(\frac{\log d}{(n\epsilon)^{1/3}}\right)$ in the $\epsilon$-DP…
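One reason heavy-tailed data is awkward for differential privacy is that a single unbounded sample can move an average arbitrarily far, so there is no finite sensitivity to calibrate noise against. A common generic workaround is to truncate (clip) each sample before averaging. The sketch below shows that idea for a one-dimensional mean with the Laplace mechanism under the replace-one neighboring relation. It is an illustration of the general principle only, not the estimator used in the paper; the function names, the clipping threshold, and the Pareto test data are all assumptions made for the example.

```python
import random


def laplace_noise(scale: float, rng: random.Random) -> float:
    """Draw Laplace(0, scale) noise.

    The difference of two i.i.d. Exp(1) draws is a standard Laplace variable.
    """
    return scale * (rng.expovariate(1.0) - rng.expovariate(1.0))


def private_clipped_mean(samples, clip: float, eps: float, rng: random.Random) -> float:
    """eps-DP mean estimate: clip each sample to [-clip, clip], average, add noise.

    After clipping, replacing one sample changes the mean by at most
    2 * clip / n, so Laplace noise with scale (2 * clip / n) / eps suffices.
    """
    clipped = [max(-clip, min(clip, x)) for x in samples]
    n = len(clipped)
    sensitivity = 2.0 * clip / n
    return sum(clipped) / n + laplace_noise(sensitivity / eps, rng)


if __name__ == "__main__":
    rng = random.Random(0)
    # Pareto(alpha=2.5) has a finite second moment but a heavy right tail.
    data = [rng.paretovariate(2.5) for _ in range(100_000)]
    print("non-private mean :", sum(data) / len(data))
    print("private estimate :", private_clipped_mean(data, clip=20.0, eps=1.0, rng=rng))
```

The clipping threshold trades bias against noise: a small threshold discards more of the tail but needs less noise, while a large threshold keeps more of the data but inflates the sensitivity.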
Taxonomy
Topics: Privacy-Preserving Technologies in Data · Stochastic Gradient Optimization Techniques · Complexity and Algorithms in Graphs
