GPT-FL: Generative Pre-trained Model-Assisted Federated Learning

Tuo Zhang; Tiantian Feng; Samiul Alam; Dimitrios Dimitriadis; Sunwoo Lee; Mi Zhang; Shrikanth S. Narayanan; Salman Avestimehr

arXiv:2306.02210·cs.LG·June 19, 2024·2 cites

TL;DR

GPT-FL introduces a federated learning framework that utilizes generative pre-trained models to produce synthetic data, enhancing model accuracy, communication efficiency, and convergence speed across various data modalities.

Contribution

This work presents GPT-FL, a framework that uses generative pre-trained models to produce synthetic training data and thereby improve federated learning performance.

Findings

01. GPT-FL outperforms state-of-the-art FL methods in accuracy and efficiency.

02. Synthetic data generated by GPT-FL enhances convergence speed.

03. GPT-FL achieves significant gains regardless of data domain alignment.

Abstract

In this work, we propose GPT-FL, a generative pre-trained model-assisted federated learning (FL) framework. At its core, GPT-FL leverages generative pre-trained models to generate diversified synthetic data. These generated data are used to train a downstream model on the server, which is then fine-tuned with private client data under the standard FL framework. We show that GPT-FL consistently outperforms state-of-the-art FL methods in terms of model test accuracy, communication efficiency, and client sampling efficiency. Through comprehensive ablation analysis across various data modalities, we discover that the downstream model generated by synthetic data plays a crucial role in controlling the direction of gradient diversity during FL training, which enhances convergence speed and contributes to the notable accuracy boost observed with GPT-FL. Also, regardless of whether the target…
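
The two-stage flow described in the abstract (train on synthetic data at the server, then fine-tune with private client data) can be sketched in a few lines. This is a toy illustration, not the authors' implementation: the generative model is replaced by a hypothetical sampler (`make_data`) that draws from a slightly mismatched distribution, the downstream model is a 1-D linear regressor, and the "standard FL framework" is assumed here to be FedAvg. All function names and constants are made up for the example.

```python
import random


def make_data(rng, n, slope, intercept, noise):
    """Hypothetical stand-in for a data source: y = slope*x + intercept + noise."""
    data = []
    for _ in range(n):
        x = rng.uniform(-1.0, 1.0)
        data.append((x, slope * x + intercept + rng.gauss(0.0, noise)))
    return data


def sgd_train(weights, data, lr=0.1, epochs=5):
    """Plain SGD on a 1-D linear model y = w*x + b with squared loss."""
    w, b = weights
    for _ in range(epochs):
        for x, y in data:
            err = (w * x + b) - y
            w -= lr * err * x
            b -= lr * err
    return (w, b)


def fedavg(client_weights, client_sizes):
    """Average client models, weighted by local dataset size (FedAvg)."""
    total = sum(client_sizes)
    w = sum(cw[0] * n for cw, n in zip(client_weights, client_sizes)) / total
    b = sum(cw[1] * n for cw, n in zip(client_weights, client_sizes)) / total
    return (w, b)


def mse(weights, data):
    """Mean squared error of the linear model on a dataset."""
    w, b = weights
    return sum((w * x + b - y) ** 2 for x, y in data) / len(data)


def gpt_fl_sketch(seed=0, rounds=5, num_clients=4):
    rng = random.Random(seed)
    # Assumption: synthetic data is close to, but not identical to, the
    # private distribution (slope 1.8 vs 2.0, intercept 0.3 vs 0.5).
    synthetic = make_data(rng, 200, slope=1.8, intercept=0.3, noise=0.1)
    clients = [make_data(rng, 50, 2.0, 0.5, 0.1) for _ in range(num_clients)]

    # Stage 1: the server trains the downstream model on synthetic data only.
    init_weights = sgd_train((0.0, 0.0), synthetic)

    # Stage 2: FL fine-tuning, starting from the synthetic-data model.
    global_weights = init_weights
    for _ in range(rounds):
        local = [sgd_train(global_weights, d, epochs=1) for d in clients]
        global_weights = fedavg(local, [len(d) for d in clients])
    return init_weights, global_weights, clients


if __name__ == "__main__":
    init_w, final_w, clients = gpt_fl_sketch()
    private = [pair for d in clients for pair in d]
    print("MSE after synthetic-only training:", mse(init_w, private))
    print("MSE after FL fine-tuning:", mse(final_w, private))
```

In this toy setting the synthetic-data model serves only as the starting point for federated training; private client data never leaves the clients, and only model weights are averaged at the server.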

Peer Reviews

Decision·Submitted to ICLR 2024

Reviewer 01 · Rating 5 (marginally below the acceptance threshold) · Confidence 3

Strengths

The writing is mostly clear. The proposed method significantly outperforms state-of-the-art FL methods.

Weaknesses

My main concern is that GPT-FL may significantly increase the burden on the central server. Note that in standard FL, the server only needs to aggregate parameters received from clients. However, GPT-FL needs to run pre-trained models on the central server, which incurs additional computational cost.

Reviewer 02 · Rating 3 (reject, not good enough) · Confidence 4

Strengths

The paper conducts extensive experiments to evaluate the performance of the proposed algorithm.

Weaknesses

I do not agree with the authors that the proposed framework is a federated learning method. To train the global model at the server, the server does not need the collaboration of clients. In fact, the participation of a client does not provide any benefit for the others. In this case, federation is not needed. Therefore, clients themselves can use a generative model to generate synthetic data and then train a model using the synthetic data locally. After that, the client can fine-tune the model. The only reason

Reviewer 03 · Rating 6 (marginally above the acceptance threshold) · Confidence 4

Strengths

I think this is a very interesting and novel approach to generally augment predictive models (not just in FL context). The paper is clearly written and is very well motivated. Experiments are well thought out, thought provoking and yield promising results.

Weaknesses

Regarding soundness, the motivation and the proposed method are very straightforward. I do not have any problem with the technical approach. However, I think there should be some extra empirical study to better understand when the proposed method will and will not work. Below are several suggestions, which are not necessarily weaknesses of this paper (although they would be interesting if addressed or investigated). GPT-FL seems to do very well on standard vision benchmarks. How do we know that th

Code & Models

Repositories

AvestimehrResearchGroup/GPT-FL
PyTorch · Official

Taxonomy

Topics: Privacy-Preserving Technologies in Data · Machine Learning in Healthcare

Methods: Test · SPEED: Separable Pyramidal Pooling EncodEr-Decoder for Real-Time Monocular Depth Estimation on Low-Resource Settings