A Simple Algorithm for Estimating Distribution Parameters from $n$-Dimensional Randomized Binary Responses
Staal A. Vinterbo

TL;DR
This paper introduces a simple, efficient algorithm for estimating distribution parameters from randomized binary responses. It supports privacy-preserving data analysis, comes with theoretical bounds on statistical efficiency, and is simple to implement.
Contribution
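It identifies a family of response randomizers that change each binary attribute independently with the same probability, shows that they transform multinomial distribution parameters by an iterated Kronecker product of an invertible, bisymmetric matrix, and provides a straightforward algorithm for unbiased maximum likelihood estimation of those parameters.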
Findings
The algorithm achieves unbiased estimates of $k$-way marginals.
The method offers theoretical bounds on statistical efficiency.
It discusses the privacy-efficiency tradeoff in randomized responses.
Abstract
Randomized response is attractive for privacy-preserving data collection because the provided privacy can be quantified by means such as differential privacy. However, recovering and analyzing statistics involving multiple dependent randomized binary attributes can be difficult, posing a significant barrier to use. In this work, we address this problem by identifying and analyzing a family of response randomizers that change each binary attribute independently with the same probability. Modes of Google's Rappor randomizer as well as applications of two well-known classical randomized response methods, Warner's original method and Simmons' unrelated question method, belong to this family. We show that randomizers in this family transform multinomial distribution parameters by an iterated Kronecker product of an invertible and bisymmetric matrix. This allows us to present a…
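The Kronecker structure described in the abstract can be illustrated with a minimal Python sketch. It is not the paper's algorithm. It assumes the simplest member of the family, where each of the $n$ bits is flipped independently with probability `f`, and it assumes that estimation amounts to applying the inverse of the $2 \times 2$ matrix along each bit, which the truncated abstract does not confirm. The names `flip_matrix`, `apply_per_bit`, and `estimate` are hypothetical.

```python
import numpy as np

def flip_matrix(f):
    """2x2 bisymmetric matrix for flipping one bit with probability f (assumed form)."""
    return np.array([[1.0 - f, f], [f, 1.0 - f]])

def apply_per_bit(theta, m):
    """Multiply theta by the n-fold Kronecker power of m without building it."""
    theta = np.asarray(theta, dtype=float)
    n = theta.size.bit_length() - 1          # theta has 2**n entries
    t = theta.reshape((2,) * n)              # one axis per binary attribute
    for axis in range(n):
        # Contract m with the current axis, then put the new axis back in place.
        t = np.moveaxis(np.tensordot(m, t, axes=([1], [axis])), 0, axis)
    return t.reshape(-1)

def estimate(counts, f):
    """Undo the randomization on observed relative frequencies (requires f != 0.5)."""
    counts = np.asarray(counts, dtype=float)
    freqs = counts / counts.sum()
    return apply_per_bit(freqs, np.linalg.inv(flip_matrix(f)))
```

Working along one axis at a time costs on the order of $n \cdot 2^n$ operations rather than the $4^n$ needed to form the full matrix. With finite samples the inverted frequencies can fall slightly outside $[0, 1]$, so a practical implementation may need to handle that case.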
Taxonomy
Topics: Survey Sampling and Estimation Techniques · Privacy-Preserving Technologies in Data · SARS-CoV-2 detection and testing
