FedImpro: Measuring and Improving Client Update in Federated Learning

Zhenheng Tang; Yonggang Zhang; Shaohuai Shi; Xinmei Tian; Tongliang Liu; Bo Han; Xiaowen Chu

arXiv:2402.07011·cs.LG·March 15, 2024·2 cites

Open Access 3 Reviews

TL;DR

FedImpro introduces a novel method for mitigating client drift in federated learning by reconstructing feature distributions and decoupling model components, leading to improved generalization across heterogeneous data sources.

Contribution

This paper presents FedImpro, a new approach that constructs similar conditional distributions to reduce client dissimilarity and improve federated learning performance.

Findings

01. FedImpro reduces gradient dissimilarity in FL.

02. It enhances model generalization under data heterogeneity.

03. Experimental results confirm improved performance with FedImpro.
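To make the first finding concrete, here is a minimal sketch of one common way to quantify gradient dissimilarity: the mean squared L2 distance between each client's gradient and the average gradient. This particular metric and the function name `gradient_dissimilarity` are assumptions made for illustration; the summary above does not say which measure the paper uses.

```python
from typing import List


def gradient_dissimilarity(client_grads: List[List[float]]) -> float:
    """Mean squared L2 distance between each client gradient and their average.

    Hypothetical helper for illustration; not taken from the paper.
    """
    n = len(client_grads)
    dim = len(client_grads[0])
    # Average gradient across clients, coordinate by coordinate.
    mean = [sum(g[i] for g in client_grads) / n for i in range(dim)]
    # Average of the squared distances to that mean.
    return sum(
        sum((g[i] - mean[i]) ** 2 for i in range(dim)) for g in client_grads
    ) / n


identical = [[1.0, 2.0], [1.0, 2.0]]   # clients agree: no drift
drifted = [[1.0, 0.0], [-1.0, 0.0]]    # clients pull in opposite directions

print(gradient_dissimilarity(identical))  # 0.0
print(gradient_dissimilarity(drifted))    # 1.0
```

A value of zero means every client computes the same update; larger values indicate that local updates disagree, which is the symptom of client drift the findings refer to.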

Abstract

Federated Learning (FL) models often experience client drift caused by heterogeneous data, where the distribution of data differs across clients. To address this issue, advanced research primarily focuses on manipulating the existing gradients to achieve more consistent client models. In this paper, we present an alternative perspective on client drift and aim to mitigate it by generating improved local models. First, we analyze the generalization contribution of local training and conclude that this generalization contribution is bounded by the conditional Wasserstein distance between the data distribution of different clients. Then, we propose FedImpro, to construct similar conditional distributions for local training. Specifically, FedImpro decouples the model into high-level and low-level components, and trains the high-level portion on reconstructed feature distributions. This…
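The decoupling idea in the abstract can be sketched roughly as follows. The excerpt does not say how the feature distributions are reconstructed, so this sketch assumes a per-class diagonal Gaussian: each client summarizes the outputs of its low-level component, a server pools the summaries, and clients draw synthetic features from the pooled estimate to train the high-level component alongside their local features. All names here (`class_feature_stats`, `merge_stats`, `sample_features`) are hypothetical and not the authors' API.

```python
import random
from collections import defaultdict
from typing import Dict, List, Tuple

# label -> (per-dimension mean, per-dimension variance, sample count)
Stats = Dict[int, Tuple[List[float], List[float], int]]


def class_feature_stats(features: List[List[float]], labels: List[int]) -> Stats:
    """Client side: summarize low-level features per class (assumed Gaussian)."""
    grouped = defaultdict(list)
    for f, y in zip(features, labels):
        grouped[y].append(f)
    stats: Stats = {}
    for y, rows in grouped.items():
        n, dim = len(rows), len(rows[0])
        mean = [sum(r[i] for r in rows) / n for i in range(dim)]
        var = [sum((r[i] - mean[i]) ** 2 for r in rows) / n for i in range(dim)]
        stats[y] = (mean, var, n)
    return stats


def merge_stats(client_stats: List[Stats]) -> Stats:
    """Server side: count-weighted pooling of per-class means and variances."""
    merged: Stats = {}
    for y in {y for s in client_stats for y in s}:
        parts = [s[y] for s in client_stats if y in s]
        total = sum(n for _, _, n in parts)
        dim = len(parts[0][0])
        mean = [sum(m[i] * n for m, _, n in parts) / total for i in range(dim)]
        # Pooled variance via E[x^2] - (E[x])^2 over all clients' samples.
        var = [
            sum((v[i] + m[i] ** 2) * n for m, v, n in parts) / total - mean[i] ** 2
            for i in range(dim)
        ]
        merged[y] = (mean, var, total)
    return merged


def sample_features(stats: Stats, per_class: int, rng: random.Random):
    """Draw synthetic (feature, label) pairs for training the high-level part."""
    out = []
    for y, (mean, var, _) in sorted(stats.items()):
        for _ in range(per_class):
            # max(..., 0.0) guards against tiny negative variances from rounding.
            feat = [rng.gauss(m, max(v, 0.0) ** 0.5) for m, v in zip(mean, var)]
            out.append((feat, y))
    return out


# Two clients with different class coverage (toy one-dimensional features).
client_a = class_feature_stats([[0.0], [2.0]], [0, 0])
client_b = class_feature_stats([[4.0], [6.0], [10.0]], [0, 0, 1])
shared = merge_stats([client_a, client_b])
synthetic = sample_features(shared, per_class=3, rng=random.Random(0))
```

In this toy setup client A never sees class 1, yet the pooled statistics let it draw class 1 features for its high-level component, so both clients train that component on a similar conditional distribution. Only summary statistics are exchanged in the sketch, not raw features; whether and how the actual method protects those statistics is not covered by the excerpt.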

Peer Reviews

Decision·ICLR 2024 poster

Reviewer 01 · Rating 8 (accept, good paper) · Confidence 3

Strengths

- The paper addresses a very relevant issue for the FL community, i.e. limiting the negative effects of the client drift in heterogeneous settings.
- The paper is well written and easy to follow.
- Very detailed discussion of related works.
- Theoretical claims supported by proofs.
- Extensive empirical analysis. FedImpro is compared with some state-of-the-art approaches in terms of final performance, convergence speed, weight divergence. Interesting ablation study on the depth of gradient decoupli

Weaknesses

- My main concern regards the feasibility of deploying FedImpro in real-world contexts. FedImpro notably increases both the number of communications between clients and server, and the message size. The paper points out how the global distribution can be estimated using methods which impact the communication network less, but that does not eliminate the need for additional communication.
- Some relevant related works are not discussed: ETF [1], SphereFed [2], FedSpeed [3].
- FedImpro is compar

Reviewer 02 · Rating 6 (marginally above the acceptance threshold) · Confidence 3

Strengths

1. The paper is well written and easy to follow. The idea is easy to understand.
2. The paper combines split training with feature sharing to improve the generalization of the model.

Weaknesses

1. I notice that the authors ignore a closely related, state-of-the-art baseline, FedDyn [1]. Could the authors conduct comparison experiments with FedDyn?
2. Training time increases for FedImpro. Could the authors list CPU-time cost comparisons to reach the target accuracy?

[1] Acar, Durmus Alp Emre, et al. "Federated learning based on dynamic regularization." *arXiv preprint arXiv:2111.04263* (2021).

Reviewer 03 · Rating 8 (accept, good paper) · Confidence 2

Strengths

1. The idea of generalization contribution in FL sounds novel.
2. The experimental performance of FedImpro looks superior.

Weaknesses

The idea of having a lower-level and a higher-level neural network in FL is not new, i.e. the feature extraction network idea. I don't see many comparisons to this previous work in the experimental section.

Taxonomy

Topics: Privacy-Preserving Technologies in Data · Cloud Data Security Solutions · Access Control and Trust