The Data Airlock: infrastructure for restricted data informatics
Gregory Rolan, Janis Dalins, and Campbell Wilson

TL;DR
The paper introduces the 'Data Airlock,' a secure infrastructure designed to enable data science on restricted data without compromising privacy or security, addressing legal and ethical challenges.
Contribution
The paper presents the architecture and implementation of a novel secure infrastructure for restricted data collaboration, covering a first, single-organization version and plans for future federated development.
Findings
Successful implementation of the Data Airlock in a real-world setting
Identification of key security and privacy challenges in restricted data environments
Insights into the requirements for federated restricted data infrastructure
Abstract
Data science collaboration is problematic when access to operational data or models from outside the data-holding organisation is prohibited, for a variety of legal, security, ethical, or practical reasons. There are significant data privacy challenges when performing collaborative data science work against such restricted data. In this paper we describe a range of causes and risks associated with restricted data along with the social, environmental, data, and cryptographic measures that may be used to mitigate such issues. We then show how these are generally inadequate for restricted data contexts and introduce the 'Data Airlock' - secure infrastructure that facilitates 'eyes-off' data science workloads. After describing our use-case we detail the architecture and implementation of a first, single-organisation version of the Data Airlock infrastructure. We conclude with outcomes and…
Taxonomy
Topics: Privacy-Preserving Technologies in Data · Scientific Computing and Data Management · Information and Cyber Security
