Federated Generative Learning with Foundation Models
Jie Zhang, Xiaohua Qi, Bo Zhao

TL;DR
This paper introduces Federated Generative Learning, a novel framework leveraging foundation models to synthesize training data remotely, improving communication efficiency, privacy, and performance in federated learning scenarios.
Contribution
The paper proposes a new federated learning framework that uses foundation generative models and local text embeddings to synthesize training data remotely, addressing efficiency and privacy issues.
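To give a rough sense of the efficiency claim, the sketch below compares the size of one model-parameter upload with the size of one prompt upload. Every number and name in it is a hypothetical assumption for illustration: the parameter count, the float32 encoding, the prompt template, and the choice to send raw prompt strings rather than the text embeddings the paper describes. None of these values come from the paper.

```python
# Back-of-the-envelope payload comparison (all values are illustrative assumptions).

def payload_bytes_parameters(num_params: int, bytes_per_param: int = 4) -> int:
    """Bytes needed to upload a dense model once, assuming float32 weights."""
    return num_params * bytes_per_param


def payload_bytes_prompts(prompts: list[str]) -> int:
    """Bytes needed to upload a list of text prompts encoded as UTF-8."""
    return sum(len(p.encode("utf-8")) for p in prompts)


# Hypothetical model size, not a figure reported in the paper.
HYPOTHETICAL_NUM_PARAMS = 11_000_000

# Hypothetical prompt template: one short prompt per class, 100 classes.
prompts = [f"A photo of class {i}" for i in range(100)]

param_bytes = payload_bytes_parameters(HYPOTHETICAL_NUM_PARAMS)
prompt_bytes = payload_bytes_prompts(prompts)
ratio = param_bytes / prompt_bytes

print(f"parameters: {param_bytes} bytes, prompts: {prompt_bytes} bytes")
print(f"ratio: {ratio:.0f}x")
```

Under these assumptions the prompt upload is several orders of magnitude smaller, and parameter-based methods typically repeat their upload every round. The actual savings depend on the model, the number of prompts, and how the text is encoded.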
Findings
Outperforms FedAvg by 12% on ImageNet100 in a single round
Enhances privacy and robustness to data heterogeneity
Validates benefits across 12 datasets
Abstract
Existing approaches in Federated Learning (FL) mainly focus on sending model parameters or gradients from clients to a server. However, these methods are plagued by significant inefficiency, privacy, and security concerns. Thanks to the emerging foundation generative models, we propose a novel federated learning framework, namely Federated Generative Learning. In this framework, each client can create text embeddings that are tailored to their local data, and send embeddings to the server. Then the informative training data can be synthesized remotely on the server using foundation generative models with these embeddings, which can benefit FL tasks. Our proposed framework offers several advantages, including increased communication efficiency, robustness to data heterogeneity, substantial performance improvements, and enhanced privacy protection. We validate these benefits through…
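The flow described in the abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the prompt template, the `ClientUpload` type, and all function names are hypothetical, and `fake_generator` is a stand-in that returns pseudo-random numbers where a real text-to-image foundation model would return an image.

```python
import random
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple


@dataclass
class ClientUpload:
    """What a client sends to the server: prompts and labels, never raw images."""
    client_id: int
    prompts: List[Tuple[str, int]]  # (text prompt, class label)


def build_class_prompts(
    client_id: int, local_labels: List[int], class_names: Dict[int, str]
) -> ClientUpload:
    """Client side: one prompt per class observed locally (hypothetical template)."""
    seen = sorted(set(local_labels))
    prompts = [(f"A photo of a {class_names[y]}", y) for y in seen]
    return ClientUpload(client_id, prompts)


def synthesize_on_server(
    uploads: List[ClientUpload],
    generate: Callable[[str, int], List[float]],
    images_per_prompt: int,
) -> List[Tuple[List[float], int]]:
    """Server side: turn every received prompt into labeled synthetic samples."""
    dataset = []
    for upload in uploads:
        for prompt, label in upload.prompts:
            for k in range(images_per_prompt):
                dataset.append((generate(prompt, k), label))
    return dataset


def fake_generator(prompt: str, seed: int) -> List[float]:
    """Placeholder for a generative model: 4 deterministic pseudo-random values."""
    rng = random.Random(f"{prompt}|{seed}")
    return [rng.random() for _ in range(4)]


# Two clients with different (non-IID) label sets; class names are made up.
class_names = {0: "dog", 1: "cat", 2: "car"}
uploads = [
    build_class_prompts(0, [0, 0, 1], class_names),
    build_class_prompts(1, [1, 2, 2, 2], class_names),
]
synthetic = synthesize_on_server(uploads, fake_generator, images_per_prompt=2)
print(len(synthetic), "synthetic samples ready for server-side training")
```

In this sketch the server ends up with samples for every class that any client reported, so a model trained on `synthetic` sees the union of the clients' label sets even though each client only holds a subset.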
Peer Reviews
Decision: Submitted to ICLR 2024
- This work proposes a novel learning framework to train on local data without accessing the raw data directly.
- Communication of prompts instead of model parameters addresses several issues of existing federated learning frameworks: high communication cost and potential privacy threats by attackers.
- The proposed method may be highly dependent on the performance of both diffusion models and visual-captioning models.
- An ablation study of varying the foundation models is needed.
- In a similar vein, the local training dataset should be unseen during pretraining of the foundation models and should be more difficult than ImageNet, which is a standard image classification dataset. As mentioned in the Introduction section, the local training data are more likely to be privacy sensitive, so they are mo
- The proposed approach is interesting and novel to my understanding. Assuming the client data distributions can be well captured by the foundation generative model, the proposed technique can offer clear benefits in simplicity and reduced communication costs.
- Putting aside the underlying assumptions of the proposed techniques (see weaknesses), the paper is overall well-executed in terms of the diversity of the experiments and visualizations.
- The paper is generally well-written and easy-to-follow
[W1] The main weakness of the proposed method is the underlying assumption that client data can, in fact, be generated by foundational models. This sounds obvious but is key to the applicability of the proposed approach in practice. To put it bluntly, is the proposed solution searching for a problem?
1. Settings where FL is helpful, such as medical images across hospitals [1] and user-generated text across mobile phones [2], are often where the data distributions aren’t covered by the pre-training da
1. The proposed approach significantly reduces communication costs compared to traditional parameter transmission.
2. By leveraging foundation models to synthesize proxy data, the authors effectively mitigate the client-shift problem.
3. A variety of experimental settings across four datasets demonstrate the robustness and effectiveness of the proposed method.
1. The training framework is predominantly tailored for image datasets, limiting its applicability.
2. The method heavily depends on the congruence between the captioning and generative models, making it challenging to ensure the proxy dataset's distribution aligns with the private data.
3. The experimental setup, with only five clients, may not adequately represent real-world scenarios; expanding the evaluation to include 50 or 100 clients could provide more insightful results.
4. The compariso
Taxonomy
Topics: Privacy-Preserving Technologies in Data
Methods: Focus
