
TL;DR
This paper develops differentially private methods for linear regression analysis that support confidence intervals and hypothesis testing while preserving data privacy, using techniques such as the Johnson-Lindenstrauss Transform (JLT) and Ridge regression.
Contribution
It introduces differentially private confidence intervals for linear regression using JLT and analyzes their effectiveness for well-spread data and regularized models.
Findings
The JLT gives an accurate approximation of t-values for well-spread data
Confidence intervals can be derived from projected data in Ridge regression
The Analyze Gauss algorithm can produce valid confidence intervals under certain conditions
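As a rough illustration of the first finding, the sketch below compares OLS t-values computed on the original data with t-values computed after multiplying the data by a random Gaussian matrix. It is a minimal sketch under stated assumptions, not the paper's algorithm: the helper `t_values`, the synthetic data, and the sizes `n`, `p`, `r` are all hypothetical, and the code carries no differential privacy guarantee on its own (the paper's privacy analysis places further conditions on the data that are not modeled here).

```python
import numpy as np

def t_values(X, y):
    """Return OLS coefficient estimates and their t-values (hypothetical helper)."""
    n, p = X.shape
    XtX_inv = np.linalg.inv(X.T @ X)
    beta_hat = XtX_inv @ X.T @ y
    resid = y - X @ beta_hat
    sigma2_hat = resid @ resid / (n - p)           # unbiased noise-variance estimate
    std_err = np.sqrt(sigma2_hat * np.diag(XtX_inv))
    return beta_hat, beta_hat / std_err

# Synthetic data (assumption): n samples, p features, the second coefficient is zero.
rng = np.random.default_rng(1)
n, p, r = 2000, 3, 500
X = rng.normal(size=(n, p))
y = X @ np.array([0.5, 0.0, -1.0]) + rng.normal(size=n)

# Random projection: an r x n matrix with i.i.d. standard Gaussian entries.
R = rng.normal(size=(r, n))

beta_full, t_full = t_values(X, y)            # t-values on the original data
beta_proj, t_proj = t_values(R @ X, R @ y)    # t-values on the projected data

# Heuristic for this sketch: projecting n rows down to r shrinks
# the t-values by a factor of roughly sqrt(r / n).
scale = np.sqrt(r / n)
print("original t-values, rescaled:", scale * t_full)
print("projected t-values:         ", t_proj)
```

The two printed rows should be close for coefficients that are far from zero, since this Gaussian design is spread evenly across directions. Data concentrated in a few directions would need the regularized treatment the paper describes.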
Abstract
Linear regression is one of the most prevalent techniques in machine learning; however, it is also common to use linear regression for its explanatory capabilities rather than label prediction. Ordinary Least Squares (OLS) is often used in statistics to establish a correlation between an attribute (e.g. gender) and a label (e.g. income) in the presence of other (potentially correlated) features. OLS assumes a particular model that randomly generates the data, and derives t-values, which represent the likelihood of each real value being the true correlation. Using t-values, OLS can release a confidence interval, which is an interval on the reals that is likely to contain the true correlation; when this interval does not contain the origin, we can reject the null hypothesis, as it is likely that the true correlation is non-zero. Our work aims at…
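The non-private OLS pipeline the abstract describes (estimate, t-value, confidence interval, null-hypothesis decision) can be sketched as follows. This is a minimal sketch, not code from the paper: the function name `ols_confidence_intervals`, the synthetic data, and the critical value 1.96 (a large-sample normal approximation to the t-distribution quantile for a 95% interval) are assumptions made for illustration.

```python
import numpy as np

def ols_confidence_intervals(X, y, crit=1.96):
    """Non-private OLS estimates, t-values, and confidence intervals.

    crit is the critical value; 1.96 approximates the 97.5% quantile of the
    t-distribution when n - p is large (an assumption of this sketch).
    """
    n, p = X.shape
    XtX_inv = np.linalg.inv(X.T @ X)
    beta_hat = XtX_inv @ X.T @ y                   # least-squares estimate
    resid = y - X @ beta_hat
    sigma2_hat = resid @ resid / (n - p)           # unbiased noise-variance estimate
    std_err = np.sqrt(sigma2_hat * np.diag(XtX_inv))
    t_vals = beta_hat / std_err                    # t-value for the null "coefficient = 0"
    lower = beta_hat - crit * std_err
    upper = beta_hat + crit * std_err
    return beta_hat, t_vals, lower, upper

# Synthetic data (assumption): the second feature has no effect on the label.
rng = np.random.default_rng(0)
n = 2000
X = rng.normal(size=(n, 3))
beta_true = np.array([0.5, 0.0, -1.0])
y = X @ beta_true + rng.normal(size=n)

beta_hat, t_vals, lower, upper = ols_confidence_intervals(X, y)

# Reject the null hypothesis for a coefficient when its interval excludes zero.
reject_null = (lower > 0) | (upper < 0)
print(reject_null)
```

For small samples, the fixed 1.96 would be replaced by the exact t-distribution quantile with n - p degrees of freedom.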
Taxonomy
Methods: Linear Regression
