Distillation-Based Semi-Supervised Federated Learning for Communication-Efficient Collaborative Training with Non-IID Private Data

Sohei Itahara, Takayuki Nishio, Yusuke Koda, Masahiro Morikura, and Koji Yamamoto

arXiv:2008.06180 · cs.DC · January 7, 2022

TL;DR

This paper introduces a distillation-based semi-supervised federated learning framework that significantly reduces communication costs by exchanging model outputs instead of parameters, while maintaining high accuracy on non-IID data.

Contribution

The paper proposes a novel DS-FL algorithm that leverages unlabeled data and output-based communication to improve efficiency and performance in federated learning.

Findings

01 Reduces communication costs by up to 99% relative to the parameter-exchanging FL benchmark

02 Achieves comparable or higher accuracy than traditional FL

03 Effectively handles non-IID data heterogeneity

Abstract

This study develops a federated learning (FL) framework that overcomes the communication costs that grow sharply with model size in typical frameworks, without compromising model performance. To this end, building on the idea of leveraging an unlabeled open dataset, we propose a distillation-based semi-supervised FL (DS-FL) algorithm that exchanges the outputs of local models among mobile devices, instead of the model parameters exchanged in typical frameworks. In DS-FL, the communication cost depends only on the output dimensions of the models and does not scale with model size. The exchanged model outputs are used to label each sample of the open dataset, which creates an additional labeled dataset. The local models are then further trained on this new dataset, and model performance improves owing to the data augmentation effect. We further highlight that in DS-FL,…
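The output-exchange idea in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the plain averaging of client outputs, the random logits standing in for local models, the function names, and the one-million-parameter model size are all assumptions made for illustration. The sketch shows only that each client uploads one probability vector per open-dataset sample, so the upload size depends on the number of open samples and classes, not on the model size.

```python
import numpy as np


def softmax(z, axis=-1):
    """Numerically stable softmax."""
    z = z - z.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)


def aggregate_outputs(client_probs):
    """Average client outputs on the shared open dataset.

    client_probs has shape (num_clients, num_open_samples, num_classes).
    Plain averaging is an assumption of this sketch.
    """
    return np.mean(client_probs, axis=0)


def distillation_loss(student_logits, soft_labels):
    """Cross-entropy between aggregated soft labels and student outputs."""
    log_p = np.log(softmax(student_logits) + 1e-12)
    return float(-(soft_labels * log_p).sum(axis=1).mean())


def upload_floats_per_round(num_open_samples, num_classes):
    """Values one client uploads per round when sharing outputs only."""
    return num_open_samples * num_classes


rng = np.random.default_rng(0)
num_clients, num_open, num_classes = 5, 100, 10

# Random logits stand in for each client's local model outputs
# on the unlabeled open dataset (hypothetical data).
client_logits = rng.normal(size=(num_clients, num_open, num_classes))
client_probs = softmax(client_logits)

# Server side: aggregate the outputs to label the open dataset.
soft_labels = aggregate_outputs(client_probs)
hard_labels = soft_labels.argmax(axis=1)

# Client side: loss a local model would minimize on the newly labeled data.
student_logits = rng.normal(size=(num_open, num_classes))
loss = distillation_loss(student_logits, soft_labels)

# Communication comparison against a hypothetical 1M-parameter model.
model_params = 1_000_000
ds_fl_cost = upload_floats_per_round(num_open, num_classes)
print(f"output exchange: {ds_fl_cost} values, parameter exchange: {model_params} values")
```

In this toy setting the per-round upload is `num_open * num_classes` values regardless of how large the local model is, which is the property the abstract attributes to DS-FL.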
