Asynchronous Collaborative Learning Across Data Silos

Tiffany Tuor; Joshua Lockhart; Daniele Magazzeni

arXiv:2203.12637·cs.LG·March 25, 2022

Open Access

TL;DR

This paper introduces an asynchronous collaborative training framework for machine learning across data silos within organizations, enabling model training without data sharing, especially useful in regulated industries.

Contribution

The paper extends conventional federated learning to support asynchronous, intra-organisation, cross-silo model training, addressing the fragmentation of data across silos.

Findings

01 Effective in enabling collaborative training without data sharing

02 Improves upon traditional federated learning for asynchronous settings

03 Validated through extensive experiments

Abstract

Machine learning algorithms can perform well when trained on large datasets. While large organisations often have considerable data assets, it can be difficult for these assets to be unified in a manner that makes training possible. Data is very often 'siloed' in different parts of the organisation, with little to no access between silos. This fragmentation of data assets is especially prevalent in heavily regulated industries like financial services or healthcare. In this paper we propose a framework to enable asynchronous collaborative training of machine learning models across data silos. This allows data science teams to collaboratively train a machine learning model, without sharing data with one another. Our proposed approach enhances conventional federated learning techniques to make them suitable for this asynchronous training in this intra-organisation, cross-silo setting. We…
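The abstract is truncated here and does not spell out the update rule, so the following is only a minimal sketch of what asynchronous cross-silo training can look like, not the authors' method. It assumes a generic staleness-discounted mixing rule on the server, a toy one-dimensional linear model, and synthetic data; every name in it (`Server`, `local_train`, `make_silo`, `simulate`, `alpha`) is hypothetical. Each silo checks out the shared model, trains on its own data only, and pushes weights back whenever it finishes; raw data never leaves the silo.

```python
import random


def local_train(weights, data, lr=0.1, epochs=20):
    """Run SGD on a 1-D linear model y = w*x + b using only this silo's data."""
    w, b = weights
    for _ in range(epochs):
        for x, y in data:
            err = (w * x + b) - y
            w -= lr * err * x
            b -= lr * err
    return (w, b)


class Server:
    """Holds the shared model; only weights are exchanged, never raw data."""

    def __init__(self, weights, alpha=0.5):
        self.weights = weights
        self.version = 0      # incremented on every accepted update
        self.alpha = alpha    # base mixing rate (assumed hyperparameter)

    def pull(self):
        return self.weights, self.version

    def push(self, local_weights, base_version):
        # Updates computed from an older model version are trusted less.
        staleness = self.version - base_version
        mix = self.alpha / (1 + staleness)
        self.weights = tuple(
            (1 - mix) * g + mix * l for g, l in zip(self.weights, local_weights)
        )
        self.version += 1
        return mix


def make_silo(rng, lo, hi, n=30):
    # Synthetic data for y = 2x + 1; each silo sees a different range of x.
    xs = [rng.uniform(lo, hi) for _ in range(n)]
    return [(x, 2.0 * x + 1.0) for x in xs]


def simulate(rounds=60, seed=0):
    rng = random.Random(seed)
    silos = [make_silo(rng, -1, 0), make_silo(rng, 0, 1), make_silo(rng, -1, 1)]
    server = Server((0.0, 0.0))
    checked_out = [server.pull() for _ in silos]
    for _ in range(rounds):
        i = rng.randrange(len(silos))  # silos finish in arbitrary order
        base_weights, base_version = checked_out[i]
        server.push(local_train(base_weights, silos[i]), base_version)
        checked_out[i] = server.pull()
    return server.weights


if __name__ == "__main__":
    print(simulate())
```

Discounting by staleness is one simple way to keep a slow silo from overwriting progress made while it was training; other weighting schemes are possible, and the paper may use a different one.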

Taxonomy

Topics: Data Stream Mining Techniques · Big Data and Business Intelligence · Data Quality and Management