On Differentially Private Subspace Estimation in a Distribution-Free Setting
Eliad Tsfadia

TL;DR
This paper introduces measures based on singular-value gaps to identify low-dimensional subspaces privately in high-dimensional data, reducing the number of points needed and providing practical algorithms with improved performance.
Contribution
It defines new measures of dataset 'easiness' for subspace estimation based on singular-value gaps and establishes bounds that enable dimension-independent private subspace estimation.
Findings
Singular-value gaps determine when subspace estimation is dimension-independent.
New upper and lower bounds for private subspace estimation based on dataset structure.
Practical algorithm demonstrating improved high-dimensional performance.
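As a rough illustration of the first finding, the sketch below computes the gap between the k-th and (k+1)-th singular values of a data matrix, along with the ordinary (non-private) projection onto its top-k subspace. It is not the paper's algorithm and adds no privacy noise. The function names, the ratio used as a gap measure, and the synthetic data are all assumptions made for illustration.

```python
import numpy as np

def singular_value_gap(X, k):
    """Return (sigma_k, sigma_{k+1}, sigma_{k+1} / sigma_k) for X (rows are points).

    Assumes 1 <= k < min(X.shape). A ratio near 0 means a large gap,
    i.e. the points lie close to a k-dimensional subspace.
    """
    s = np.linalg.svd(X, compute_uv=False)  # singular values, descending
    sigma_k, sigma_k1 = s[k - 1], s[k]
    return sigma_k, sigma_k1, sigma_k1 / sigma_k

def top_k_projection(X, k):
    """Non-private d x d projection onto the top-k right singular subspace of X."""
    _, _, vt = np.linalg.svd(X, full_matrices=False)
    V = vt[:k].T  # d x k orthonormal basis of the estimated subspace
    return V @ V.T

# Synthetic data (hypothetical): n points in d dimensions lying near a
# random k-dimensional subspace, plus small isotropic noise.
rng = np.random.default_rng(0)
n, d, k = 200, 50, 3
basis, _ = np.linalg.qr(rng.standard_normal((d, k)))  # d x k orthonormal
X = rng.standard_normal((n, k)) @ basis.T + 1e-3 * rng.standard_normal((n, d))

sigma_k, sigma_k1, ratio = singular_value_gap(X, k)
P = top_k_projection(X, k)
true_P = basis @ basis.T
subspace_error = np.linalg.norm(P - true_P)  # Frobenius distance between projections
```

On data like this the ratio is small and the estimated projection is close to the true one. When the noise is comparable to the signal the ratio approaches 1, which is the regime where the abstract's dimension-dependent lower bound is expected to matter.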
Abstract
Private data analysis faces a significant challenge known as the curse of dimensionality, which leads to increased costs. However, many datasets possess an inherent low-dimensional structure. For instance, during optimization via gradient descent, the gradients frequently reside near a low-dimensional subspace. If the low-dimensional structure could be privately identified using a small number of points, we could avoid paying for the high ambient dimension. On the negative side, Dwork, Talwar, Thakurta, and Zhang (STOC 2014) proved that privately estimating subspaces, in general, requires a number of points that depends polynomially on the dimension. However, their bounds do not rule out the possibility of reducing the number of points for "easy" instances. Yet, providing a measure that captures how "easy" a given dataset is for this task turns out to be challenging, and was…
Taxonomy
Topics: Advanced Statistical Process Monitoring · Statistical Distribution Estimation and Applications
