Towards Utilizing Unlabeled Data in Federated Learning: A Survey and Prospective

Yilun Jin; Xiguang Wei; Yang Liu; Qiang Yang

arXiv:2002.11545 · cs.LG · May 12, 2020 · 58 cites

Open Access

TL;DR

This paper surveys the potential of leveraging unlabeled data in federated learning to address data labeling costs and enhance model performance, highlighting a largely unexplored research area.

Contribution

It identifies the importance of using unlabeled data in federated learning and reviews possible research directions to advance this field.

Findings

1. Unlabeled data can significantly reduce labeling costs in FL.
2. Few existing works focus on utilizing unlabeled data in FL.
3. Survey highlights promising research areas for unlabeled data in FL.

Abstract

Federated Learning (FL), proposed in recent years, has received significant attention from researchers because it can bring separate data sources together and build machine learning models in a collaborative but private manner. Yet most existing applications of FL, such as keyboard prediction, are ones where labeling data requires virtually no additional effort, which is not generally the case. In reality, acquiring large-scale labeled datasets can be extremely costly, which motivates research that exploits unlabeled data to help build machine learning models. However, to the best of our knowledge, few existing works aim to utilize unlabeled data to enhance federated learning, which leaves a potentially promising research topic open. In this paper, we identify the need to exploit unlabeled data in FL and survey research fields that can contribute to this goal.

Taxonomy

Topics: Privacy-Preserving Technologies in Data · Machine Learning and Data Classification · Cryptography and Data Security