Disentangling data distribution for Federated Learning
Xinyuan Zhao, Hanlin Gu, Lixin Fan, Yuxing Han, Qiang Yang

TL;DR
This paper introduces FedDistr, a novel federated learning algorithm that uses diffusion models to disentangle data distributions, enabling efficient training with minimal communication while preserving privacy.
Contribution
The paper presents a new method employing diffusion models to disentangle data distributions in federated learning, improving efficiency and privacy.
Findings
FedDistr can, in principle, match the efficiency of distributed systems while requiring only one round of communication.
Empirical results show significant utility improvements on CIFAR100 and DomainNet.
The method outperforms traditional federated learning approaches in both disentangled and near-disentangled scenarios.
Abstract
Federated Learning (FL) facilitates collaborative training of a global model whose performance is boosted by private data owned by distributed clients, without compromising data privacy. Yet the wide applicability of FL is hindered by entanglement of data distributions across different clients. This paper demonstrates for the first time that by disentangling data distributions FL can in principle achieve efficiencies comparable to those of distributed systems, requiring only one round of communication. To this end, we propose a novel FedDistr algorithm, which employs stable diffusion models to decouple and recover data distributions. Empirical results on the CIFAR100 and DomainNet datasets show that FedDistr significantly enhances model utility and efficiency in both disentangled and near-disentangled scenarios while ensuring privacy, outperforming traditional federated learning methods.
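The one-round idea in the abstract can be illustrated with a toy sketch. This is not the FedDistr algorithm: a per-class Gaussian summary stands in for the diffusion model, and the function names (`client_summary`, `server_synthesize`, `fit_centroids`, `predict`, `run_demo`), the 2-D synthetic data, and the nearest-centroid classifier are all assumptions made for illustration. It only shows the communication pattern: each client uploads a compact description of its local distribution once, and the server trains on data sampled from those descriptions.

```python
import numpy as np

def client_summary(x, y):
    # Hypothetical client step: one (mean, std, count) triple per local class.
    # A Gaussian stands in here for the diffusion model used in the paper.
    return {int(c): (x[y == c].mean(axis=0), x[y == c].std(axis=0), int((y == c).sum()))
            for c in np.unique(y)}

def server_synthesize(summaries, rng):
    # Hypothetical server step: sample synthetic points from every uploaded summary.
    xs, ys = [], []
    for summary in summaries:
        for label, (mean, std, count) in summary.items():
            xs.append(rng.normal(mean, std, size=(count, mean.shape[0])))
            ys.append(np.full(count, label))
    return np.concatenate(xs), np.concatenate(ys)

def fit_centroids(x, y):
    # Global model for this toy: one centroid per class.
    labels = np.unique(y)
    return labels, np.stack([x[y == c].mean(axis=0) for c in labels])

def predict(model, x):
    # Assign each point to the class of its nearest centroid.
    labels, centroids = model
    dists = np.linalg.norm(x[:, None, :] - centroids[None, :, :], axis=2)
    return labels[dists.argmin(axis=1)]

def run_demo(seed=0):
    rng = np.random.default_rng(seed)
    centers = np.array([[0.0, 0.0], [6.0, 0.0], [0.0, 6.0], [6.0, 6.0]])

    def make(labels, n):
        # Draw n noisy 2-D points around the center of each listed class.
        y = np.repeat(labels, n)
        x = centers[y] + rng.normal(0.0, 1.0, size=(len(y), 2))
        return x, y

    # Disentangled toy setting: the two clients hold disjoint sets of classes.
    clients = [make(np.array([0, 1]), 200), make(np.array([2, 3]), 200)]

    # Single round of communication: each client uploads its summary once.
    summaries = [client_summary(x, y) for x, y in clients]

    syn_x, syn_y = server_synthesize(summaries, rng)
    model = fit_centroids(syn_x, syn_y)

    # Evaluate on fresh real data covering all classes.
    test_x, test_y = make(np.array([0, 1, 2, 3]), 100)
    return float((predict(model, test_x) == test_y).mean())

if __name__ == "__main__":
    print(f"test accuracy after one round: {run_demo():.3f}")
```

Because the clients in this sketch hold disjoint classes, their summaries do not conflict and the server can simply pool them. The harder near-disentangled case mentioned in the abstract is not modeled here.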
Peer Reviews
Decision: ICLR 2025 Conference Withdrawn Submission
1. Both communication cost and heterogeneity are important challenges in federated learning. 2. The authors verify the efficacy of the proposed algorithm through numerical experiments. 3. The proposed algorithm saves communication cost compared to baseline federated learning algorithms with little utility loss.
1. The paper is not well-motivated. In federated learning, heterogeneity is a more common term than the so-called disentanglement. Furthermore, in the first paragraph, the authors claim 'There is a consensus that this inefficiency stems from the entanglement of data distribution across clients'. Can you provide references for this claim of 'consensus'? In fact, in the later part of the introduction, the authors show that it is the disentanglement where classical FL algorithms like FedAvg do not p
1. The paper is easy to follow. 2. The experiment results look good.
1. The proposed method may have the potential to leak privacy. 2. Some related work on methods that transmit knowledge instead of models between clients and the server may need to be reviewed and discussed. 3. There are some minor spelling and grammatical errors.
- Novel application of diffusion models to disentangle and aggregate data distributions in federated learning. - Demonstrates some improvements in communication efficiency and utility for specific datasets.
- The approach is similar to previous approaches [1, 2] but does not discuss them. - The theoretical results are questionable and assumptions are not clearly stated. - The results seem to contradict previous theoretical results [1]. - The description of the approach omits important details, in particular, how the local models are trained and if local models can differ.
The primary, and possibly only, strength of the paper is its ability to achieve high accuracy with only one communication round in a heterogeneous federated learning scenario. With this result, and with theoretical support proved in the appendix, the authors propose a unique solution to a commonly known problem in federated learning. The idea of disentangling the local data is original and significant.
The weaknesses of the paper are all indirectly related to its contribution. The writing severely lacks clarity throughout. The authors do not explain concepts clearly and do not mention how similar the core of their work is to Liang et al. (2024). They cite that paper in a small section, but it is much more significant than they lead readers to believe, so it needs to be introduced in the related work section with a proper breakdown. T
Taxonomy
Topics: Privacy-Preserving Technologies in Data · Cryptography and Data Security · Stochastic Gradient Optimization Techniques
Methods: Diffusion
