Bringing the Algorithms to the Data -- Secure Distributed Medical Analytics using the Personal Health Train (PHT-meDIC)
Marius de Arruda Botelho Herr, Michael Graf, Peter Placzek, Florian König, Felix Bötte, Tyra Stickel, David Hieber, Lukas Zimmermann, Michael Slupina, Christopher Mohr, Stephanie Biergans, Mete Akgün, Nico Pfeifer, Oliver Kohlbacher

TL;DR
The paper presents PHT-meDIC, an open-source system implementing the Personal Health Train paradigm to enable secure, privacy-preserving distributed medical data analysis across multiple sites using containerization.
Contribution
It introduces PHT-meDIC, a scalable, secure platform for distributed medical analytics that adheres to data privacy regulations and facilitates complex analysis pipelines.
Findings
Successful deployment on large-scale distributed data
Application of deep neural networks to medical images
Enhanced security and governance in distributed analysis
Abstract
The need for data privacy and security -- enforced through increasingly strict data protection regulations -- renders the use of healthcare data for machine learning difficult. In particular, the transfer of data between different hospitals is often not permissible, and thus cross-site pooling of data is not an option. The Personal Health Train (PHT) paradigm proposed within the GO-FAIR initiative implements an 'algorithm to the data' paradigm that ensures that distributed data can be accessed for analysis without transferring any sensitive data. We present PHT-meDIC, a productively deployed open-source implementation of the PHT concept. Containerization allows us to easily deploy even complex data analysis pipelines (e.g., genomics, image analysis) across multiple sites in a secure and scalable manner. We discuss the underlying technological concepts, security models, and governance…
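The 'algorithm to the data' idea in the abstract can be illustrated with a toy sketch. This is not the PHT-meDIC implementation: the `Station`, `Train`, and `dispatch` names are hypothetical, and the sketch models neither containerization nor the system's security and governance layers. It only shows the core pattern, in which the analysis visits each site in turn and carries away aggregate results rather than raw records.

```python
from dataclasses import dataclass, field


@dataclass
class Train:
    """Hypothetical 'train': carries only aggregate state between sites."""
    count: int = 0
    total: float = 0.0
    visited: list[str] = field(default_factory=list)

    def mean(self) -> float:
        # Global mean computed from the accumulated aggregates.
        return self.total / self.count


@dataclass
class Station:
    """Hypothetical site whose raw values stay local."""
    name: str
    values: list[float]

    def run(self, train: Train) -> None:
        # The analysis runs at the site; only a count and a sum are
        # written to the train, never the individual values.
        train.count += len(self.values)
        train.total += sum(self.values)
        train.visited.append(self.name)


def dispatch(train: Train, route: list[Station]) -> Train:
    """Send the train along a route of stations, one after another."""
    for station in route:
        station.run(train)
    return train


# Illustrative data only (assumed values, not from the paper).
route = [
    Station("site_a", [1.0, 2.0, 3.0]),
    Station("site_b", [5.0, 9.0]),
]
train = dispatch(Train(), route)
print(train.visited, train.mean())
```

In this sketch the pooled mean is obtained although neither site exposes its individual values. A real deployment would additionally need to protect the aggregates themselves, which is the kind of concern the security model mentioned in the abstract addresses.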
Taxonomy
Topics: Scientific Computing and Data Management · Artificial Intelligence in Healthcare and Education · Privacy-Preserving Technologies in Data
