PP-GWAS: Privacy Preserving Multi-Site Genome-wide Association Studies
Arjhun Swaminathan, Anika Hannemann, Ali Burak Ünal, Nico Pfeifer, Mete Akgün

TL;DR
PP-GWAS introduces a privacy-preserving, efficient algorithm for multi-site genome-wide association studies that significantly reduces computation time and resource usage while maintaining strong data privacy protections.
Contribution
It presents a novel randomized encoding algorithm for distributed ridge regression in GWAS, improving efficiency and scalability over existing privacy-preserving methods.
Findings
Achieves twice the speed of current algorithms
Uses fewer computational resources
Maintains robust security against semi-honest adversaries
Abstract
Genome-wide association studies (GWAS) are pivotal in understanding the genetic underpinnings of complex traits and diseases. Collaborative, multi-site GWAS aim to enhance statistical power but face obstacles due to the sensitive nature of genomic data sharing. Current state-of-the-art methods preserve privacy with computationally expensive techniques such as Secure Multi-Party Computation and Homomorphic Encryption. In this context, we present a novel algorithm, PP-GWAS, designed to improve upon existing standards in terms of computational efficiency and scalability without sacrificing data privacy. This algorithm employs randomized encoding within a distributed architecture to perform stacked ridge regression on a Linear Mixed Model to ensure rigorous analysis. Experimental evaluation with real-world and synthetic data indicates that PP-GWAS can achieve computational speeds…
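To make the idea of randomized encoding for distributed ridge regression more concrete, here is a minimal sketch. It is not the PP-GWAS protocol, whose details are not given on this page. It only illustrates a standard linear-algebra fact: multiplying a site's rows by a random orthogonal matrix leaves the ridge solution unchanged. The helper names (random_orthogonal, ridge, masked_multisite_ridge), the site sizes, and the penalty value are all hypothetical, and this toy masking alone carries no formal privacy guarantee.

```python
import numpy as np

def random_orthogonal(n, rng):
    # The Q factor from the QR decomposition of a Gaussian matrix is orthogonal.
    q, _ = np.linalg.qr(rng.standard_normal((n, n)))
    return q

def ridge(X, y, lam):
    # Closed-form ridge estimate: (X^T X + lam * I)^{-1} X^T y
    p = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(p), X.T @ y)

def masked_multisite_ridge(sites, lam, rng):
    # Each site masks its own rows with a private orthogonal matrix Q_i.
    # Because Q_i^T Q_i = I, (Q_i X_i)^T (Q_i X_i) = X_i^T X_i, so a server
    # that stacks the masked blocks recovers the pooled ridge solution.
    masked = []
    for X, y in sites:
        Q = random_orthogonal(X.shape[0], rng)
        masked.append((Q @ X, Q @ y))
    X_stacked = np.vstack([Xm for Xm, _ in masked])
    y_stacked = np.concatenate([ym for _, ym in masked])
    return ridge(X_stacked, y_stacked, lam)

# Hypothetical toy data: three sites, five features each.
rng = np.random.default_rng(0)
sites = [(rng.standard_normal((n, 5)), rng.standard_normal(n)) for n in (30, 40, 50)]

beta_masked = masked_multisite_ridge(sites, 1.0, rng)
beta_plain = ridge(
    np.vstack([X for X, _ in sites]),
    np.concatenate([y for _, y in sites]),
    1.0,
)
print(np.allclose(beta_masked, beta_plain))
```

The comparison at the end checks that the estimate computed from masked data matches the one computed from the pooled plaintext data, which is the property a masking-based scheme relies on for accuracy.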
Taxonomy
Topics: Ethics in Clinical Research · Reproductive Health and Technologies
