Towards General-purpose Infrastructure for Protecting Scientific Data Under Study
Andrew Trask, Kritika Prakash

TL;DR
This paper introduces privacy-preserving infrastructure that lets scientists experiment over sensitive data with familiar tools and workflows while the system automatically prohibits privacy leakage.
Contribution
Earlier work has begun combining privacy techniques into end-to-end systems; this paper assembles the first such combination sufficient for a privacy layman to experiment over private data with familiar tools.
Findings
Prototype implemented within the Syft privacy platform using the PyTorch framework
Infrastructure automatically prohibits privacy leakage during experiments
Lets a privacy layman experiment over private data with familiar tools
Abstract
The scientific method presents a key challenge to privacy because it requires many samples to support a claim. When samples are commercially valuable or privacy-sensitive enough, their owners have strong reasons to avoid releasing them for scientific study. Privacy techniques seek to mitigate this tension by enforcing limits on one's ability to use studied samples for secondary purposes. Recent work has begun combining these techniques into end-to-end systems for protecting data. In this work, we assemble the first such combination which is sufficient for a privacy-layman to use familiar tools to experiment over private data while the infrastructure automatically prohibits privacy leakage. We support this theoretical system with a prototype within the Syft privacy platform using the PyTorch framework.
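To make the idea of infrastructure that "automatically prohibits privacy leakage" concrete, here is a minimal sketch. It is an illustration only, not the paper's design and not Syft's API. The abstract does not name the techniques involved, so this sketch assumes one common choice, differential privacy with a fixed budget. The names `PrivateColumn`, `BudgetExceeded`, and `noisy_mean` are hypothetical.

```python
import random


class BudgetExceeded(Exception):
    """Raised when a query would spend more privacy budget than remains."""


class PrivateColumn:
    """Hypothetical wrapper: holds raw values, releases only noisy aggregates."""

    def __init__(self, values, lower, upper, epsilon_budget, seed=None):
        if not values:
            raise ValueError("values must be non-empty")
        if upper <= lower:
            raise ValueError("upper must be greater than lower")
        # Clip each value to [lower, upper] so one sample's influence is bounded.
        self._values = [min(max(v, lower), upper) for v in values]
        self._lower = lower
        self._upper = upper
        self.remaining = epsilon_budget  # total epsilon still available
        self._rng = random.Random(seed)

    def noisy_mean(self, epsilon):
        """Return the mean plus Laplace noise, spending `epsilon` of the budget."""
        if epsilon <= 0:
            raise ValueError("epsilon must be positive")
        if epsilon > self.remaining:
            # The infrastructure, not the analyst, enforces the limit.
            raise BudgetExceeded("query refused: privacy budget exhausted")
        self.remaining -= epsilon
        n = len(self._values)
        true_mean = sum(self._values) / n
        # Replacing one clipped record moves the mean by at most (upper - lower) / n.
        scale = (self._upper - self._lower) / (n * epsilon)
        # The difference of two exponential draws is Laplace(0, scale).
        noise = self._rng.expovariate(1 / scale) - self._rng.expovariate(1 / scale)
        return true_mean + noise


# Assumed usage: the analyst never touches the raw values directly.
column = PrivateColumn([1.0, 2.0, 3.0, 4.0], lower=0.0, upper=5.0, epsilon_budget=1.0)
estimate = column.noisy_mean(epsilon=0.5)
```

In this sketch the analyst works with an ordinary Python object and can keep calling `noisy_mean` until the budget is spent, after which further queries raise `BudgetExceeded`. That is one way to read the abstract's claim that the system, rather than the user, carries the privacy expertise.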
Taxonomy
Topics: Privacy-Preserving Technologies in Data · Scientific Computing and Data Management · Cryptography and Data Security
