GPT-FL: Generative Pre-trained Model-Assisted Federated Learning
Tuo Zhang, Tiantian Feng, Samiul Alam, Dimitrios Dimitriadis, Sunwoo Lee, Mi Zhang, Shrikanth S. Narayanan, Salman Avestimehr

TL;DR
GPT-FL introduces a federated learning framework that utilizes generative pre-trained models to produce synthetic data, enhancing model accuracy, communication efficiency, and convergence speed across various data modalities.
Contribution
This work presents a novel GPT-FL framework that leverages generative pre-trained models to improve federated learning performance with synthetic data training.
Findings
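GPT-FL outperforms state-of-the-art FL methods in model test accuracy, communication efficiency, and client sampling efficiency.
The downstream model trained on GPT-FL's synthetic data helps control the direction of gradient diversity during FL training, which speeds up convergence.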
GPT-FL achieves significant gains regardless of data domain alignment.
Abstract
In this work, we propose GPT-FL, a generative pre-trained model-assisted federated learning (FL) framework. At its core, GPT-FL leverages generative pre-trained models to generate diversified synthetic data. These generated data are used to train a downstream model on the server, which is then fine-tuned with private client data under the standard FL framework. We show that GPT-FL consistently outperforms state-of-the-art FL methods in terms of model test accuracy, communication efficiency, and client sampling efficiency. Through comprehensive ablation analysis across various data modalities, we discover that the downstream model generated by synthetic data plays a crucial role in controlling the direction of gradient diversity during FL training, which enhances convergence speed and contributes to the notable accuracy boost observed with GPT-FL. Also, regardless of whether the target…
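The two-stage pipeline described in the abstract (train a downstream model on synthetic data at the server, then fine-tune it on private client data with standard FL) can be illustrated with a toy sketch. Everything here is an assumption made for illustration and is not the authors' implementation: the model is a one-dimensional linear regressor, FedAvg stands in for "the standard FL framework", and the hypothetical `make_data` call that produces `synthetic` merely imitates a generative model by sampling from a related but shifted distribution.

```python
import random

def make_data(rng, n, slope, intercept, x_lo, x_hi, noise=0.05):
    """Sample n (x, y) pairs from y = slope*x + intercept + Gaussian noise."""
    data = []
    for _ in range(n):
        x = rng.uniform(x_lo, x_hi)
        data.append((x, slope * x + intercept + rng.gauss(0.0, noise)))
    return data

def sgd(weights, data, lr=0.1, epochs=1):
    """Plain SGD on squared error for the linear model y = w*x + b."""
    w, b = weights
    for _ in range(epochs):
        for x, y in data:
            err = (w * x + b) - y
            w, b = w - lr * err * x, b - lr * err
    return (w, b)

def fedavg_round(global_weights, client_datasets, lr=0.1):
    """One FedAvg round: one local epoch per client, then a size-weighted average."""
    total = sum(len(d) for d in client_datasets)
    avg_w = avg_b = 0.0
    for data in client_datasets:
        w, b = sgd(global_weights, data, lr=lr, epochs=1)
        avg_w += w * len(data) / total
        avg_b += b * len(data) / total
    return (avg_w, avg_b)

def mse(weights, data):
    """Mean squared error of the linear model on a dataset."""
    w, b = weights
    return sum(((w * x + b) - y) ** 2 for x, y in data) / len(data)

def run_demo(seed=0, rounds=5):
    rng = random.Random(seed)
    # Private client data: same task (y = 2x + 0.5), non-IID input ranges.
    ranges = [(-1.0, -0.3), (-0.3, 0.3), (0.3, 1.0)]
    clients = [make_data(rng, 20, 2.0, 0.5, lo, hi) for lo, hi in ranges]
    test_set = make_data(rng, 200, 2.0, 0.5, -1.0, 1.0)
    # ASSUMPTION: stand-in for the generative model. Synthetic samples come
    # from a shifted task (y = 1.5x), i.e. imperfect domain alignment.
    synthetic = make_data(rng, 200, 1.5, 0.0, -1.0, 1.0)

    # Stage 1: the server trains the downstream model on synthetic data only.
    pretrained = sgd((0.0, 0.0), synthetic, epochs=5)
    # Stage 2: FL fine-tuning on private data, starting from that model.
    finetuned = pretrained
    for _ in range(rounds):
        finetuned = fedavg_round(finetuned, clients)
    # Baseline: the same number of FedAvg rounds from a zero initialization.
    scratch = (0.0, 0.0)
    for _ in range(rounds):
        scratch = fedavg_round(scratch, clients)

    return {
        "zero_init": mse((0.0, 0.0), test_set),
        "pretrained": mse(pretrained, test_set),
        "finetuned": mse(finetuned, test_set),
        "scratch": mse(scratch, test_set),
    }

if __name__ == "__main__":
    for name, value in run_demo().items():
        print(f"{name}: {value:.4f}")
```

Running the script prints the test error of the zero initialization, the synthetic-data model, the fine-tuned model, and a from-scratch FedAvg baseline after the same number of rounds. The toy setup only illustrates the order of the two stages; it says nothing about the accuracy or efficiency gains reported in the paper.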
Peer Reviews
Decision: Submitted to ICLR 2024
The writing is mostly clear. The proposed method significantly outperforms state-of-the-art FL methods.
My main concern is that GPT-FL may significantly increase the burden on the central server. In standard FL, the server only needs to aggregate the parameters received from clients. GPT-FL, however, needs to run pre-trained models on the central server, which incurs additional computational cost.
The paper conducts extensive experiments to evaluate the performance of the proposed algorithm.
I do not agree with the authors that the proposed framework is a federated learning method. To train the global model, the server does not need the collaboration of clients. In fact, the participation of one client does not provide any benefit to the others. In this case, federation is not needed. Clients can therefore use a generative model themselves to generate synthetic data, train a model on that synthetic data locally, and then fine-tune the model. The only reason…
I think this is a very interesting and novel approach to generally augment predictive models (not just in FL context). The paper is clearly written and is very well motivated. Experiments are well thought out, thought provoking and yield promising results.
Regarding soundness, the motivation and the proposed method are very straightforward. I do not have any problem with the technical approach. However, I think there should be some extra empirical study to better understand when the proposed method will and will not work. Below are several suggestions, which are not necessarily weaknesses of this paper (although they would be interesting to address or investigate). GPT-FL seems to do very well on standard vision benchmarks. How do we know that th…
Taxonomy
Topics: Privacy-Preserving Technologies in Data · Machine Learning in Healthcare
Methods: Test · SPEED: Separable Pyramidal Pooling EncodEr-Decoder for Real-Time Monocular Depth Estimation on Low-Resource Settings
