A Game-Theoretic Approach to Privacy-Utility Tradeoff in Sharing Genomic Summary Statistics
Tao Zhang, Rajagopal Venkatesaramani, Rajat K. De, Bradley A. Malin, Yevgeniy Vorobeychik

TL;DR
This paper introduces a Bayesian game-theoretic framework for optimizing privacy-utility tradeoffs in sharing genomic summary statistics, demonstrating it outperforms traditional methods against powerful adaptive attacks.
Contribution
It develops a comprehensive Bayesian attacker model, compares it with existing LRT-based models, and proposes neural network-based equilibrium approximations for enhanced privacy-utility balancing.
Findings
The Bayesian attacker model is more powerful than LRT-based attacker models.
The proposed framework yields both stronger attacks and stronger defenses than traditional methods.
The neural network approach effectively approximates the game's equilibria.
Abstract
The advent of online genomic data-sharing services has sought to enhance the accessibility of large genomic datasets by allowing queries about genetic variants, such as summary statistics, aiding care providers in distinguishing between spurious genomic variations and those with clinical significance. However, numerous studies have demonstrated that even sharing summary genomic information exposes individual members of such datasets to a significant privacy risk due to membership inference attacks. While several approaches have emerged that reduce privacy risks by adding noise or reducing the amount of information shared, these typically assume non-adaptive attacks that use likelihood ratio test (LRT) statistics. We propose a Bayesian game-theoretic framework for optimal privacy-utility tradeoff in the sharing of genomic summary statistics. Our first contribution is to prove that a very…
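As a rough illustration of the non-adaptive LRT-style membership inference attack that the abstract contrasts against, the following is a minimal sketch and not the paper's formulation. It assumes binary genotypes (1 if the individual carries the alternate allele at a variant, 0 otherwise), independent variants, published allele frequencies for the shared dataset (`pool_freqs`), and frequencies from a reference population (`ref_freqs`). The function names, the clipping constant `eps`, and the fixed threshold are all hypothetical choices made for this example.

```python
import math


def lrt_score(genotype, pool_freqs, ref_freqs, eps=1e-6):
    """Log-likelihood ratio of a genotype under the shared dataset's
    frequencies versus a reference population's frequencies.

    Assumption: variants are independent and each genotype entry is 0 or 1.
    Higher scores suggest the individual is more likely in the dataset.
    """
    score = 0.0
    for d, p_pool, p_ref in zip(genotype, pool_freqs, ref_freqs):
        # Clip frequencies away from 0 and 1 so the logarithms stay finite.
        p_pool = min(max(p_pool, eps), 1.0 - eps)
        p_ref = min(max(p_ref, eps), 1.0 - eps)
        if d == 1:
            score += math.log(p_pool / p_ref)
        else:
            score += math.log((1.0 - p_pool) / (1.0 - p_ref))
    return score


def infer_membership(genotype, pool_freqs, ref_freqs, threshold=0.0):
    """Non-adaptive attack: flag membership when the score exceeds a
    fixed threshold (hypothetical default of 0.0)."""
    return lrt_score(genotype, pool_freqs, ref_freqs) > threshold


# Toy data, invented purely for illustration.
pool = [0.9, 0.1, 0.8]
ref = [0.5, 0.5, 0.5]
likely_member = [1, 0, 1]
unlikely_member = [0, 1, 0]
print(infer_membership(likely_member, pool, ref))
print(infer_membership(unlikely_member, pool, ref))
```

The fixed threshold is what makes this attacker non-adaptive: it does not respond to whatever noise or masking the data sharer applies, which is the assumption the Bayesian game-theoretic framework is described as relaxing.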
Taxonomy
Topics: Privacy-Preserving Technologies in Data · Advanced Causal Inference Techniques · Genetic Associations and Epidemiology
