Confederated Machine Learning on Horizontally and Vertically Separated Medical Data for Large-Scale Health System Intelligence
Dianbo Liu, Kathe Fox, Griffin Weber, Tim Miller

TL;DR
This paper introduces a novel confederated machine learning approach that enables training models on health data fragmented across different organizations and data types without requiring patient ID matching, addressing privacy and regulatory challenges.
Contribution
It proposes and evaluates a new confederated learning method capable of handling horizontally and vertically separated health data without patient ID matching.
Findings
Effective risk stratification models developed for multiple diseases.
Method preserves privacy while enabling large-scale health data analysis.
Addresses data fragmentation issues in health systems.
Abstract
Health information is generally fragmented across silos. Though it is technically feasible to unite data for analysis in a manner that underpins a rapid learning healthcare system, privacy concerns and regulatory barriers limit data centralization. Machine learning can be conducted in a federated manner on patient datasets that share the same set of variables but are separated across sites of care. However, federated learning cannot handle the situation where different data types for a given patient are separated vertically across different organizations and where patient ID matching across institutions is difficult. We call methods that enable machine learning model training on data separated by two or more degrees confederated machine learning. We proposed and evaluated a confederated learning approach to train machine learning models that stratify the risk of several diseases when data are…
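The two kinds of separation described in the abstract can be made concrete with a toy example. The sketch below is only an illustration of the data layout, not the authors' method: the record fields (`site`, `diagnoses`, `medications`), the sample values, and all function names are hypothetical and chosen for this example.

```python
# Toy records; every field name and value here is invented for illustration.
records = [
    {"site": "A", "diagnoses": ["E11"], "medications": ["metformin"]},
    {"site": "A", "diagnoses": ["I10"], "medications": ["lisinopril"]},
    {"site": "B", "diagnoses": ["J45"], "medications": ["albuterol"]},
]


def split_horizontally(rows):
    """Group whole records by site: same variables, different patients."""
    by_site = {}
    for row in rows:
        by_site.setdefault(row["site"], []).append(row)
    return by_site


def split_vertically(rows, data_types):
    """Make one silo per data type: same patients, different variables.

    No patient identifier is stored in any silo. List order is kept only
    for simplicity; in a real setting the silos would not be aligned.
    """
    return {t: [row[t] for row in rows] for t in data_types}


def split_both_ways(rows, data_types):
    """Apply both splits: data are separated by site and by data type."""
    return {
        site: split_vertically(site_rows, data_types)
        for site, site_rows in split_horizontally(rows).items()
    }


silos = split_both_ways(records, ["diagnoses", "medications"])
for site, by_type in silos.items():
    for data_type, values in by_type.items():
        print(site, data_type, len(values))
```

Each `(site, data type)` pair ends up as its own silo holding no patient identifier, which is the situation the abstract says standard federated learning cannot handle.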
Taxonomy
Topics: Machine Learning in Healthcare · Big Data Technologies and Applications · Privacy-Preserving Technologies in Data
