Differentially Private Sparse Linear Regression with Heavy-tailed Responses

Xizhi Tian; Meng Ding; Touming Tao; Zihang Xiang; Di Wang

arXiv:2506.06861 · cs.LG · June 10, 2025

TL;DR

This paper develops differentially private methods for high-dimensional sparse linear regression with heavy-tailed responses, achieving improved error bounds and demonstrating superior performance over existing algorithms on synthetic and real data.

Contribution

It introduces two novel DP algorithms, DP-IHT-H and DP-IHT-L, tailored for heavy-tailed data in high dimensions, with theoretical error bounds and empirical validation.

Findings

01. DP-IHT-H achieves an estimation error bound that depends on the tail heaviness of the responses and on the sample size.

02. DP-IHT-L further improves the error bound and makes it independent of the tail heaviness.

03. Experiments show that the proposed methods outperform standard DP algorithms on real datasets.

Abstract

As a fundamental problem in machine learning and differential privacy (DP), DP linear regression has been extensively studied. However, most existing methods focus primarily on either regular data distributions or low-dimensional cases with irregular data. To address these limitations, this paper provides a comprehensive study of DP sparse linear regression with heavy-tailed responses in high-dimensional settings. In the first part, we introduce the DP-IHT-H method, which leverages the Huber loss and private iterative hard thresholding to achieve an estimation error bound of $\tilde{O}\biggl( (s^*)^{\frac{1}{2}} \cdot \biggl(\frac{\log d}{n}\biggr)^{\frac{\zeta}{1 + \zeta}} + (s^*)^{\frac{1 + 2\zeta}{2 + 2\zeta}} \cdot \biggl(\frac{\log^2 d}{n \varepsilon}\biggr)^{\frac{\zeta}{1 + \zeta}} \biggr)$ under the $(\varepsilon, \delta)$-DP model, where $n$ is the sample size,…
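To make the two ingredients named in the abstract concrete, the following is a minimal sketch of iterative hard thresholding driven by the Huber loss, with Gaussian noise added to each gradient. It is not the paper's DP-IHT-H algorithm: the function names, the step size, the threshold `tau`, and the noise scale `noise_std` are all hypothetical choices for illustration, the noise is not calibrated to any $(\varepsilon, \delta)$ budget, and the sketch therefore carries no formal privacy guarantee.

```python
import numpy as np


def huber_grad(residual, tau):
    """Derivative of the Huber loss in the residual: the residual clipped to [-tau, tau]."""
    return np.clip(residual, -tau, tau)


def hard_threshold(v, s):
    """Keep the s largest-magnitude entries of v and set the rest to zero."""
    out = np.zeros_like(v)
    if s > 0:
        idx = np.argpartition(np.abs(v), -s)[-s:]
        out[idx] = v[idx]
    return out


def noisy_huber_iht(X, y, s, tau=1.0, step=0.1, iters=50, noise_std=0.0, seed=0):
    """Illustrative sketch only (assumed parameters, no calibrated DP guarantee)."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    theta = np.zeros(d)
    for _ in range(iters):
        residual = X @ theta - y
        # Clipping the residual bounds the influence of heavy-tailed responses.
        grad = X.T @ huber_grad(residual, tau) / n
        # Placeholder perturbation; a real DP method would calibrate this noise.
        grad = grad + rng.normal(0.0, noise_std, size=d)
        # Gradient step followed by projection onto s-sparse vectors.
        theta = hard_threshold(theta - step * grad, s)
    return theta


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    X = rng.normal(size=(500, 50))
    theta_star = np.zeros(50)
    theta_star[:3] = [2.0, -1.5, 1.0]
    # Student-t noise with few degrees of freedom stands in for heavy-tailed responses.
    y = X @ theta_star + rng.standard_t(df=2.5, size=500)
    est = noisy_huber_iht(X, y, s=3, noise_std=0.01)
    print("estimated support:", np.flatnonzero(est))
```

The Huber gradient is what distinguishes this from squared-loss IHT: large residuals contribute at most `tau` in magnitude, so a few extreme responses cannot dominate the update.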

Taxonomy

Topics: Privacy-Preserving Technologies in Data · Stochastic Gradient Optimization Techniques · Cryptography and Data Security

Methods: Linear Regression · Focus · Huber loss