SIMD-PAC-DB: Pretty Performant PAC Privacy
Ilaria Battiston, Dandan Yuan, Xiaochen Zhu, Peter Boncz

TL;DR
This paper introduces SIMD-PAC-DB, an optimized, SIMD-friendly implementation of the PAC-DB privacy model. It computes the privatized answer with a single query instead of the 128 stochastic executions the original PAC-DB needs, which makes PAC-private querying more practical.
Contribution
It presents a single-query implementation of PAC-DB, together with SIMD-friendly algorithms for approximate stochastic aggregates that run substantially faster than their scalar equivalents.
Findings
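Stochastic aggregates run up to 40x faster than scalar equivalents
A single query yields the same privatized answer as the original 128 sub-sample executions
Evaluated for performance and utility on thousands of queries from TPC-H, Clickbench, and SQLStorm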
Abstract
This work presents a highly optimized implementation of PAC-DB, a recent and promising database privacy model. We prove that our SIMD-PAC-DB can compute the same privatized answer with just a single query, instead of the 128 stochastic executions against different 50% database sub-samples needed by the original PAC-DB. Our key insight is that every bit of a hashed primary key can be seen to represent membership of such a sub-sample. We present new algorithms for approximate computation of stochastic aggregates based on these hashes, which, thanks to their SIMD-friendliness, run up to 40x faster than scalar equivalents. We release an open-source DuckDB community extension which includes a rewriter that PAC-privatizes arbitrary SQL queries. Our experiments on TPC-H, Clickbench, and SQLStorm evaluate thousands of queries in terms of performance and utility, significantly advancing the ease…
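The key insight in the abstract, that each bit of a hashed primary key marks membership in one 50% sub-sample, can be illustrated with a minimal scalar sketch. This is not the authors' implementation: the hash function (blake2b as a stand-in), the 128-bit width, and the names `key_hash`, `subsample_sums`, and `naive_subsample_sum` are all assumptions made for illustration, and the sketch covers only the per-sub-sample SUM, not the SIMD kernels or the privatization step.

```python
import hashlib

NUM_SAMPLES = 128  # assumption: one sub-sample per bit of a 128-bit hash


def key_hash(pk: int) -> int:
    """Hypothetical 128-bit hash of a primary key (blake2b is a stand-in)."""
    digest = hashlib.blake2b(str(pk).encode(), digest_size=16).digest()
    return int.from_bytes(digest, "little")


def subsample_sums(rows):
    """Single pass over (pk, value) rows, returning one SUM per sub-sample.

    Bit j of the key hash decides whether the row belongs to sub-sample j,
    so each row lands in roughly half of the 128 sub-samples.
    """
    sums = [0] * NUM_SAMPLES
    for pk, value in rows:
        h = key_hash(pk)
        for j in range(NUM_SAMPLES):
            if (h >> j) & 1:
                sums[j] += value
    return sums


def naive_subsample_sum(rows, j):
    """Reference: one separate scan of the data for sub-sample j."""
    return sum(value for pk, value in rows if (key_hash(pk) >> j) & 1)


rows = [(pk, 3 * pk + 1) for pk in range(200)]
sums = subsample_sums(rows)
```

Here `subsample_sums(rows)` produces in one scan the same 128 values that 128 separate calls to `naive_subsample_sum` would. The inner loop over bits is the part a SIMD-friendly formulation would vectorize.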
Taxonomy
Topics: Privacy-Preserving Technologies in Data · Cryptography and Data Security · Data Management and Algorithms
