Optimizing error of high-dimensional statistical queries under differential privacy

Ryan McKenna; Gerome Miklau; Michael Hay; Ashwin Machanavajjhala

arXiv:1808.03537·cs.DB·August 13, 2018

TL;DR

This paper introduces HDMM, a differentially private algorithm for answering workloads of predicate counting queries. It targets higher-dimensional data, where existing techniques are either accurate only on a narrow class of workloads or extremely slow, and it supports the safe release of summary tabulations.

Contribution

HDMM represents query workloads with an implicit matrix representation and exploits that compact form to efficiently search (a subset of) the space of differentially private algorithms for one tailored to the input workload.
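
To illustrate why an implicit representation helps, here is a minimal sketch. It assumes the workload over a multi-attribute domain can be written as a Kronecker product of small per-attribute matrices. The helper names (`prefix`, `identity`, `kron_matvec`) and the domain sizes are hypothetical, chosen for illustration, and are not taken from the paper's code. The sketch evaluates the workload on a data vector without ever building the full matrix, then checks the result against the explicit product on a domain small enough to materialize.

```python
import numpy as np

def prefix(n):
    """All n prefix-range counting queries over a domain of size n."""
    return np.tril(np.ones((n, n)))

def identity(n):
    """One point query per domain element."""
    return np.eye(n)

def kron_matvec(factors, x):
    """Compute (A_1 kron ... kron A_d) @ x without forming the product matrix."""
    shape = [A.shape[1] for A in factors]
    t = x.reshape(shape)  # view the data vector as a d-way tensor
    for axis, A in enumerate(factors):
        # Apply A along one attribute axis, then restore the axis order.
        t = np.moveaxis(np.tensordot(A, t, axes=([1], [axis])), 0, axis)
    return t.reshape(-1)

# Hypothetical 3-attribute domain of size 4 x 3 x 5.
factors = [prefix(4), identity(3), prefix(5)]
rng = np.random.default_rng(0)
x = rng.integers(0, 10, size=4 * 3 * 5).astype(float)

implicit = kron_matvec(factors, x)

# Explicit check; only feasible because this domain is tiny.
explicit = np.kron(np.kron(factors[0], factors[1]), factors[2]) @ x
assert np.allclose(implicit, explicit)
```

The implicit form stores only the per-attribute factors (16 + 9 + 25 entries here) instead of the full 60 x 60 matrix, and the gap widens quickly as attributes are added.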

Findings

1. HDMM achieves lower error than state-of-the-art methods on various datasets.

2. It is especially effective for high-dimensional datasets.

3. The algorithm is computationally efficient for complex query workloads.

Abstract

Differentially private algorithms for answering sets of predicate counting queries on a sensitive database have many applications. Organizations that collect individual-level data, such as statistical agencies and medical institutions, use them to safely release summary tabulations. However, existing techniques are accurate only on a narrow class of query workloads, or are extremely slow, especially when analyzing more than one or two dimensions of the data. In this work we propose HDMM, a new differentially private algorithm for answering a workload of predicate counting queries, that is especially effective for higher-dimensional datasets. HDMM represents query workloads using an implicit matrix representation and exploits this compact representation to efficiently search (a subset of) the space of differentially private algorithms for one that answers the input query workload with…
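
As a concrete picture of the setting, the sketch below encodes a small workload of predicate counting queries as a 0/1 matrix over a histogram of the data and answers it with the standard Laplace mechanism. This is a simple baseline, not HDMM itself. The toy ages, bin edges and function names are assumptions made for illustration.

```python
import numpy as np

def l1_sensitivity(W):
    """Max L1 column norm: how much one record can change the answers W @ x."""
    return np.abs(W).sum(axis=0).max()

def laplace_answers(W, x, epsilon, rng):
    """Answer the workload W on data vector x with epsilon-DP Laplace noise."""
    scale = l1_sensitivity(W) / epsilon
    return W @ x + rng.laplace(0.0, scale, size=W.shape[0])

# Toy sensitive data: ages, bucketed into 4 cells [0,18), [18,40), [40,65), [65,120].
ages = np.array([23, 35, 35, 41, 58, 62, 67, 70])
x, _ = np.histogram(ages, bins=[0, 18, 40, 65, 120])
x = x.astype(float)

# Each row is one predicate counting query over the cells.
W = np.array([
    [1, 1, 1, 1],  # total count
    [0, 0, 1, 1],  # age >= 40
    [0, 0, 0, 1],  # age >= 65
], dtype=float)

true_answers = W @ x
noisy_answers = laplace_answers(W, x, epsilon=1.0, rng=np.random.default_rng(0))
```

Here the noise scale grows with the workload's sensitivity, which is one reason a method that searches over alternative ways of answering the same workload can reduce error.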
