Randomness Concerns When Deploying Differential Privacy
Simson L. Garfinkel, Philip Leclerc

TL;DR
This paper explains why differential privacy in large-scale government data releases needs high-quality randomness and why that randomness is hard to generate. It reviews software and hardware random number generators and proposes a hybrid scheme for AWS.
Contribution
It provides a comprehensive review of random number generation methods for differential privacy and proposes a hybrid approach using hardware and software sources in AWS.
Findings
Mersenne Twister (MT19937) is unsuitable for production privacy-preserving systems
Hardware RNG instructions such as Intel's RDRAND and RDSEED are reviewed and evaluated
A hybrid RNG scheme using /dev/urandom and Intel Secure Key is proposed
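One way to picture the hybrid idea in the last finding is a seed built from two independently drawn inputs. The sketch below is illustrative only: the hash-based mixing step, the function names `mix`, `hardware_bytes`, and `mixed_seed`, and the label `seed-mix-example` are assumptions made for this example and are not taken from the paper. Python's standard library cannot call Intel Secure Key, so `hardware_bytes` is a placeholder that reads the OS generator a second time.

```python
import hashlib
import os


def mix(a: bytes, b: bytes, n: int = 32) -> bytes:
    """Hash two inputs into an n-byte seed (assumed mixing step, not the paper's)."""
    h = hashlib.shake_256()
    h.update(b"seed-mix-example")        # illustrative domain-separation label
    for part in (a, b):
        # Length-prefix each input so (a, b) boundaries are unambiguous.
        h.update(len(part).to_bytes(4, "big"))
        h.update(part)
    return h.digest(n)


def hardware_bytes(n: int) -> bytes:
    """Hypothetical stand-in for a hardware source such as Intel Secure Key.

    A real deployment would replace this with a binding to the CPU
    instruction; here it simply reads the OS generator again.
    """
    return os.urandom(n)


def mixed_seed(n: int = 32) -> bytes:
    """Draw from both sources and combine them into one seed."""
    software = os.urandom(n)             # backed by /dev/urandom on Linux
    hardware = hardware_bytes(n)
    return mix(software, hardware, n)


if __name__ == "__main__":
    print(mixed_seed().hex())
```

The point of hashing both inputs together is that the seed stays unpredictable as long as at least one source is sound. The resulting seed would then feed a deterministic generator, which this sketch leaves out.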
Abstract
The U.S. Census Bureau is using differential privacy (DP) to protect confidential respondent data collected for the 2020 Decennial Census of Population & Housing. The Census Bureau's DP system is implemented in the Disclosure Avoidance System (DAS) and requires a source of random numbers. We estimate that the 2020 Census will require roughly 90TB of random bytes to protect the person and household tables. Although there are critical differences between cryptography and DP, they have similar requirements for randomness. We review the history of random number generation on deterministic computers, including von Neumann's "middle-square" method, Mersenne Twister (MT19937) (previously the default NumPy random number generator, which we conclude is unacceptable for use in production privacy-preserving systems), and the Linux /dev/urandom device. We also review hardware random number…
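A short illustration of the Mersenne Twister concern, using only Python's standard library: CPython's `random` module is built on MT19937, so anyone who knows the seed can regenerate the whole stream, and the generator's internal state can in general be reconstructed from 624 consecutive 32-bit outputs. The function names `mt_stream` and `os_stream` are made up for this sketch, and the excerpt above does not show the paper's own reasoning, so treat this as one commonly cited weakness rather than the authors' argument.

```python
import random
import secrets


def mt_stream(seed: int, n: int) -> list[int]:
    """Return n 32-bit values from a seeded Mersenne Twister (MT19937)."""
    rng = random.Random(seed)            # deterministic given the seed
    return [rng.getrandbits(32) for _ in range(n)]


def os_stream(n: int) -> list[int]:
    """Return n 32-bit values from the operating system's CSPRNG."""
    return [secrets.randbits(32) for _ in range(n)]


if __name__ == "__main__":
    # Same seed, same "noise": the stream is fully reproducible.
    print(mt_stream(2020, 3) == mt_stream(2020, 3))
    # OS-backed draws have no user-visible seed to replay.
    print(os_stream(3) == os_stream(3))
```

Reproducibility is useful for testing, but in a privacy-preserving release it means the added noise is only as secret as the seed.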
