Best Linear Unbiased Estimate from Privatized Contingency Tables
Jordan Awan, Adam Edwards, Paul Bartholomew, Andrew Sillers

TL;DR
This paper introduces SEA BLUE, a scalable and efficient linear estimator that improves the accuracy of privatized contingency tables by enforcing self-consistency. It is designed for large-scale applications such as the Decennial Census.
Contribution
The paper presents SEA BLUE, a two-step algorithm (aggregation followed by differencing) that computes the best linear unbiased estimate from privatized data and scales to large datasets.
Findings
SEA BLUE enforces self-consistency while remaining linear and unbiased.
The method achieves minimum variance under structural assumptions.
Empirical results demonstrate robustness and scalability.
Abstract
In differential privacy (DP) mechanisms, it can be beneficial to release "redundant" outputs, where some quantities can be estimated in multiple ways by combining different privatized values. Indeed, the DP 2020 Decennial Census products published by the U.S. Census Bureau consist of such redundant noisy counts. When redundancy is present, the DP output can be improved by enforcing self-consistency (i.e., estimators obtained using different noisy counts result in the same value), and we show that the minimum variance processing is a linear projection. However, standard projection algorithms require excessive computation and memory, making them impractical for large-scale applications such as the Decennial Census. We propose the Scalable Efficient Algorithm for Best Linear Unbiased Estimate (SEA BLUE), based on a two-step process of aggregation and differencing that 1) enforces…
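The claim that the minimum variance processing is a linear projection can be illustrated on a toy 2×2 table. The sketch below is not SEA BLUE. It is the dense weighted least squares projection that the abstract describes as too costly at Census scale, shown only to make "self-consistency" concrete. The query matrix `A`, the cell counts `x_true`, the equal noise variances, and the use of continuous Gaussian noise are all illustrative assumptions, and `blue_projection` is a hypothetical helper name.

```python
import numpy as np

def blue_projection(A, y, variances):
    """Weighted least squares estimate for y = A x + noise.

    Assumes independent, mean-zero noise with known variances.
    Returns the estimated cells and the self-consistent fitted queries.
    """
    w = 1.0 / np.asarray(variances, dtype=float)  # inverse-variance weights
    AtW = A.T * w                                 # equals A^T diag(w)
    x_hat = np.linalg.solve(AtW @ A, AtW @ y)     # solve the normal equations
    return x_hat, A @ x_hat

# Redundant queries on a 2x2 table with cells (x11, x12, x21, x22):
# four cells, two row totals, two column totals, one grand total.
A = np.array([
    [1, 0, 0, 0],
    [0, 1, 0, 0],
    [0, 0, 1, 0],
    [0, 0, 0, 1],
    [1, 1, 0, 0],  # row 1 total
    [0, 0, 1, 1],  # row 2 total
    [1, 0, 1, 0],  # column 1 total
    [0, 1, 0, 1],  # column 2 total
    [1, 1, 1, 1],  # grand total
], dtype=float)

rng = np.random.default_rng(0)
x_true = np.array([30.0, 12.0, 7.0, 51.0])   # made-up counts
variances = np.full(A.shape[0], 4.0)         # assumed equal noise variance
y = A @ x_true + rng.normal(0.0, np.sqrt(variances))  # noisy, inconsistent release

x_hat, y_hat = blue_projection(A, y, variances)

print("noisy row-1 total:   ", y[4], "vs noisy cells:", y[0] + y[1])
print("adjusted row-1 total:", y_hat[4], "vs adjusted cells:", y_hat[0] + y_hat[1])
```

After the projection, every total equals the sum of its adjusted cells, whereas the raw noisy values generally disagree. The solve step works with a dense matrix whose size grows with the number of table cells, which is the cost that motivates a scalable alternative.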
Taxonomy
Topics: Advanced Statistical Process Monitoring · Advanced Statistical Methods and Models
