Sharing Knowledge without Sharing Data: Stitches can improve ensembles of disjointly trained models
Arthur Guijt, Dirk Thierens, Ellen Kerkhof, Jan Wiersma, Tanja Alderliesten, Peter A.N. Bosman

TL;DR
This paper explores how stitching intermediate representations in independently trained models can enhance ensemble performance in data-scarce, fragmented settings like healthcare, without requiring synchronized training or data sharing.
Contribution
It introduces a stitching method for combining models trained separately, improving ensemble performance and generalization in asynchronous collaborative scenarios.
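To make the idea of a stitch concrete, here is a minimal sketch, not the authors' implementation. It assumes a stitch can be approximated by an affine map fitted with least squares between the intermediate features of two models. The paper's stitches may instead be trained by gradient descent inside deep networks. All names (`fit_affine_stitch`, `apply_stitch`, `front_a`, `front_b`) and the toy linear "front halves" are hypothetical.

```python
import numpy as np

def fit_affine_stitch(feats_a, feats_b):
    """Fit an affine map from model A's features to model B's feature space."""
    # Append a column of ones so the fitted map includes a bias term.
    design = np.hstack([feats_a, np.ones((feats_a.shape[0], 1))])
    weights, *_ = np.linalg.lstsq(design, feats_b, rcond=None)
    return weights

def apply_stitch(feats_a, weights):
    """Map model A's features into model B's feature space."""
    design = np.hstack([feats_a, np.ones((feats_a.shape[0], 1))])
    return design @ weights

rng = np.random.default_rng(0)
x = rng.normal(size=(256, 16))          # shared inputs (toy data)

# Hypothetical "front halves" of two independently trained models.
# They are linear here only to keep the sketch short.
front_a = rng.normal(size=(16, 16))
front_b = rng.normal(size=(16, 12))
feats_a = x @ front_a
feats_b = x @ front_b

stitch = fit_affine_stitch(feats_a, feats_b)
stitched = apply_stitch(feats_a, stitch)

# The stitched features could now be fed into the remaining layers of model B.
error = float(np.max(np.abs(stitched - feats_b)))
print(stitched.shape, error)
```

In this toy setting the two feature spaces are related by an exact linear map, so the fit is essentially perfect. With real networks the stitch would only approximate the target representation.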
Findings
Stitching models recovers performance close to joint training.
Ensembles of individually trained models generalize better.
Stitching enables asynchronous collaboration without data sharing.
Abstract
Deep learning has been shown to be very capable at performing many real-world tasks. However, this performance is often dependent on the presence of large and varied datasets. In some settings, like in the medical domain, data is often fragmented across parties, and cannot be readily shared. While federated learning addresses this situation, it is a solution that requires synchronicity of parties training a single model together, exchanging information about model weights. We investigate how asynchronous collaboration, where only already trained models are shared (e.g. as part of a publication), affects performance, and propose to use stitching as a method for combining models. Through taking a multi-objective perspective, where performance on each party's data is viewed independently, we find that training solely on a single party's data results in similar performance when merging…
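The multi-objective perspective mentioned in the abstract can be illustrated with a small Pareto-dominance check, where each model is scored separately on every party's data. This is a sketch under stated assumptions: the function names are hypothetical, higher scores are assumed to be better, and the numbers are made up for illustration and are not results from the paper.

```python
def dominates(a, b):
    """True if a is no worse than b for every party and strictly better for at least one."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))

def pareto_front(scores):
    """Keep only the models that no other model dominates."""
    return {
        name: s
        for name, s in scores.items()
        if not any(dominates(other, s)
                   for other_name, other in scores.items()
                   if other_name != name)
    }

# Illustrative, made-up scores: (performance on party 1's data, on party 2's data).
scores = {
    "model_a_only": (0.90, 0.60),
    "model_b_only": (0.62, 0.88),
    "combined": (0.85, 0.84),
    "weak_baseline": (0.60, 0.58),
}

front = pareto_front(scores)
print(sorted(front))
```

Viewing each party's performance as its own objective avoids collapsing the scores into a single average, which could hide a model that does well for one party at the expense of another.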
Taxonomy
Topics: Domain Adaptation and Few-Shot Learning · Advanced Neural Network Applications · Mobile Crowdsensing and Crowdsourcing
