A Feasibility Study of Differentially Private Summary Statistics and Regression Analyses with Evaluations on Administrative and Survey Data
Andrés F. Barrientos, Aaron R. Williams, Joshua Snoke, and Claire McKay Bowen

TL;DR
This study evaluates the feasibility of differentially private methods for releasing summary statistics and regression analyses on sensitive administrative and survey data, highlighting their strengths and limitations for policy research.
Contribution
It introduces new adaptations of DP regression methods for complex data types and assesses their practical utility on real datasets, providing the first comprehensive analysis of this kind.
Findings
DP methods work well for simple statistics
DP regression estimates and confidence intervals are less accurate
Data complexity poses challenges for DP methods
Abstract
Federal administrative data, such as tax data, are invaluable for research, but because of privacy concerns, access to these data is typically limited to select agencies and a few individuals. An alternative to sharing microlevel data is to allow individuals to query statistics without directly accessing the confidential data. This paper studies the feasibility of using differentially private (DP) methods to make certain queries while preserving privacy. We also include new methodological adaptations to existing DP regression methods for using new data types and returning standard error estimates. We define feasibility as the impact of DP methods on analyses for making public policy decisions and the queries' accuracy according to several utility metrics. We evaluate the methods using Internal Revenue Service data and public-use Current Population Survey data and identify how specific…
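To make the idea of querying a statistic without direct access to the confidential data concrete, the following is a minimal sketch of a differentially private mean using the standard Laplace mechanism. It is an illustration only and is not taken from the paper: the function names `laplace_noise` and `dp_mean`, the clipping bounds, and the choice to treat the record count as public are all assumptions made here. Floating-point Laplace sampling of this kind is also known to be unsuitable for production privacy guarantees.

```python
import random


def laplace_noise(scale, rng):
    # The difference of two i.i.d. exponentials with mean `scale`
    # follows a Laplace(0, scale) distribution.
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def dp_mean(values, lower, upper, epsilon, rng=None):
    """Hypothetical epsilon-DP mean via the Laplace mechanism.

    Assumptions (not from the paper): values are clipped to
    [lower, upper], the number of records n is public, and neighbouring
    datasets differ by replacing one record.
    """
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    if upper <= lower:
        raise ValueError("upper must exceed lower")
    if not values:
        raise ValueError("values must be non-empty")
    rng = rng or random.Random()
    n = len(values)
    # Clipping bounds the influence of any single record.
    clipped = [min(max(v, lower), upper) for v in values]
    # Replacing one record moves the clipped mean by at most this much.
    sensitivity = (upper - lower) / n
    return sum(clipped) / n + laplace_noise(sensitivity / epsilon, rng)


if __name__ == "__main__":
    incomes = [10.0, 20.0, 30.0, 40.0]
    print(dp_mean(incomes, lower=0.0, upper=50.0, epsilon=1.0))
```

Smaller values of `epsilon` add more noise, which is the privacy and accuracy trade-off that a feasibility evaluation of this kind has to weigh.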
Taxonomy
Topics: Economic and Environmental Valuation · Survey Methodology and Nonresponse · Privacy-Preserving Technologies in Data
