Quantifying Privacy Risks of Public Statistics to Residents of Subsidized Housing
Ryan Steed, Diana Qing, Zhiwei Steven Wu

TL;DR
This paper investigates the privacy risks that public statistics pose to residents of subsidized housing. It shows that combining Decennial Census and HUD statistics can identify households living in violation of occupancy guidelines, and that a swapping-based disclosure avoidance method does little to prevent this.
Contribution
It demonstrates a simple attack method to identify subsidized households in violation of occupancy guidelines and evaluates the effectiveness of different privacy-preserving mechanisms.
Findings
Random swapping does not significantly hinder the attack.
Differential privacy reduces the attack's effectiveness.
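Combining public statistics from the Decennial Census and the Department of Housing and Urban Development exposes subsidized households that exceed occupancy guidelines, a risk that swapping-style protections leave open.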
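To make the contrast between the two kinds of protection concrete, here is a minimal Python sketch of each applied to toy household records. It is an illustration under stated assumptions, not the paper's code or the Census Bureau's implementation: the function names (`swap_households`, `laplace_counts`, `size_histogram`), the toy data, the swap rate, and the choice of Laplace noise with sensitivity 1 are all hypothetical.

```python
import random

def size_histogram(households, block, max_size=7):
    """Count households in `block` by size; sizes above max_size are top-coded."""
    hist = [0] * max_size
    for b, size in households:
        if b == block:
            hist[min(size, max_size) - 1] += 1
    return hist

def swap_households(households, swap_rate, rng):
    """Toy random swapping: exchange block labels between randomly paired records.

    Assumption: pairs are drawn uniformly at random; real swapping schemes
    target records by risk and match on key variables.
    """
    out = list(households)
    n_swap = int(len(out) * swap_rate) // 2 * 2  # need an even number to pair
    chosen = rng.sample(range(len(out)), n_swap)
    for i, j in zip(chosen[0::2], chosen[1::2]):
        (bi, si), (bj, sj) = out[i], out[j]
        out[i], out[j] = (bj, si), (bi, sj)
    return out

def laplace_counts(counts, epsilon, rng):
    """Toy Laplace mechanism on a histogram.

    Assumption: one household changes one cell by at most 1, so the noise
    scale is 1/epsilon. The difference of two exponentials with rate epsilon
    is Laplace-distributed with that scale.
    """
    return [c + rng.expovariate(epsilon) - rng.expovariate(epsilon) for c in counts]

rng = random.Random(0)
households = [("A", 1), ("A", 2), ("A", 6), ("B", 3), ("B", 4), ("B", 2)]
swapped = swap_households(households, swap_rate=0.5, rng=rng)
noisy = laplace_counts(size_histogram(households, "A"), epsilon=1.0, rng=rng)
```

In this sketch, swapping leaves the number of households per block and the overall distribution of household sizes unchanged and only moves some records between blocks. The Laplace mechanism perturbs every published cell, so an attacker can no longer read exact counts of large households from the table.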
Abstract
As the U.S. Census Bureau implements its controversial new disclosure avoidance system, researchers and policymakers debate the necessity of new privacy protections for public statistics. With experiments on both public statistics and synthetic microdata, we explore a particular privacy concern: respondents in subsidized housing may deliberately not mention unauthorized children and other household members for fear of being discovered and evicted. By combining public statistics from the Decennial Census and the Department of Housing and Urban Development, we demonstrate a simple, inexpensive reconstruction attack that could identify subsidized households living in violation of occupancy guidelines in 2010. Experiments on synthetic data suggest that a random swapping mechanism similar to the Census Bureau's 2010 disclosure avoidance measures does not significantly reduce the precision of…
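The idea of combining two public tables can be sketched in a few lines of Python. This is a toy illustration, not the authors' attack: the function `flag_over_limit`, the input layout, and the occupancy limit of two persons per bedroom are assumptions made for the example. It assumes that a census block's household-size counts are published and that housing data report how many subsidized units of a given bedroom size the block contains.

```python
def flag_over_limit(size_hist, subsidized_units, bedrooms_per_unit,
                    persons_per_bedroom=2):
    """Count households that must be subsidized AND exceed the assumed limit.

    size_hist[k] is the published number of households with k+1 members
    (the last cell is top-coded). The attribution is only certain when every
    occupied household in the block is subsidized, so other blocks return 0.
    """
    total_households = sum(size_hist)
    if total_households == 0 or subsidized_units < total_households:
        return 0  # cannot attribute large households to subsidized units
    limit = bedrooms_per_unit * persons_per_bedroom  # hypothetical guideline
    return sum(count for size, count in enumerate(size_hist, start=1)
               if size > limit)

# Hypothetical block: 11 households, all in one-bedroom subsidized units.
block_hist = [3, 4, 2, 1, 0, 1, 0]
flagged = flag_over_limit(block_hist, subsidized_units=11, bedrooms_per_unit=1)
```

Here the four households with more than two members are flagged, because every household in the block is known to occupy a one-bedroom subsidized unit. When only some of the block's households are subsidized, this simple rule makes no claim.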
Taxonomy
Topics: Privacy-Preserving Technologies in Data
