Efficient Device Scheduling with Multi-Job Federated Learning
Chendi Zhou, Ji Liu, Juncheng Jia, Jingbo Zhou, Yang Zhou, Huaiyu Dai, Dejing Dou

TL;DR
This paper introduces a multi-job federated learning framework with scheduling methods that train up to 8.67 times faster and reach up to 44.6% higher accuracy across multiple datasets and jobs.
Contribution
The paper proposes a parallel training framework for multi-job federated learning (FL) and two scheduling algorithms, one based on reinforcement learning and one on Bayesian optimization.
Findings
Training is up to 8.67 times faster
Model accuracy increased by up to 44.6%
Effective scheduling improves multi-job FL efficiency
Abstract
Recent years have witnessed a large amount of decentralized data in multiple (edge) devices of end-users, while the aggregation of the decentralized data remains difficult for machine learning jobs due to laws or regulations. Federated Learning (FL) emerges as an effective approach to handling decentralized data without sharing the sensitive raw data, while collaboratively training global machine learning models. The servers in FL need to select (and schedule) devices during the training process. However, the scheduling of devices for multiple jobs with FL remains a critical and open problem. In this paper, we propose a novel multi-job FL framework to enable the parallel training process of multiple jobs. The framework consists of a system model and two scheduling methods. In the system model, we propose a parallel training process of multiple jobs, and construct a cost model based on…
Taxonomy
Topics: Privacy-Preserving Technologies in Data · IoT and Edge/Fog Computing · Age of Information Optimization
