Debiasing Federated Learning with Correlated Client Participation

Zhenyu Sun; Ziyang Zhang; Zheng Xu; Gauri Joshi; Pranay Sharma; Ermin Wei

arXiv:2410.01209 · cs.LG · October 3, 2024

TL;DR

This paper models client participation in federated learning as a Markov chain to analyze and mitigate bias caused by correlated client availability, proposing a debiasing algorithm with proven convergence.

Contribution

It introduces a Markov chain-based framework for analyzing client participation and proposes a novel debiasing algorithm for FedAvg under correlated client availability.
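
The minimum-separation pattern can be illustrated with a small simulation. The sketch below is a toy model under stated assumptions, not the paper's exact sampling procedure: it uses $M$ groups with exactly one group selected per round, and it samples among eligible groups in proportion to hypothetical availability weights `p`. The function names `simulate_participation` and `tv_from_uniform` are illustrative only.

```python
from collections import deque

import numpy as np


def simulate_participation(p, R, rounds, seed=0):
    """Empirical participation frequency of each group under minimum separation R."""
    p = np.asarray(p, dtype=float)
    M = len(p)
    if not 0 <= R <= M - 1:
        raise ValueError("R must lie in [0, M - 1] so that some group is always eligible")
    rng = np.random.default_rng(seed)
    counts = np.zeros(M)
    recent = deque(maxlen=R)  # groups selected in the last R rounds
    for _ in range(rounds):
        weights = p.copy()
        for g in recent:
            weights[g] = 0.0  # still waiting: cannot be selected this round
        g = int(rng.choice(M, p=weights / weights.sum()))
        counts[g] += 1
        recent.append(g)
    return counts / rounds


def tv_from_uniform(freq):
    """Total variation distance between participation frequencies and uniform."""
    freq = np.asarray(freq, dtype=float)
    return 0.5 * float(np.abs(freq - 1.0 / len(freq)).sum())


if __name__ == "__main__":
    p = [0.1, 0.2, 0.3, 0.4]  # hypothetical availability weights (assumption)
    for R in range(len(p)):
        freq = simulate_participation(p, R, rounds=20_000)
        print(R, np.round(freq, 3), round(tv_from_uniform(freq), 3))
```

In this toy model, $R = 0$ reduces to independent sampling proportional to `p`, while $R = M - 1$ forces a round-robin order, so the frequencies are exactly uniform whenever `rounds` is a multiple of $M$.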

Findings

1. Increasing minimum separation reduces bias in client participation.
2. The proposed debiasing algorithm converges to the unbiased optimal solution.
3. Empirical results confirm theoretical analysis.
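
The idea of correcting biased participation can be sketched with inverse-frequency reweighting on a toy scalar problem. This is not the paper's algorithm. It assumes one client per round sampled i.i.d. from a non-uniform distribution (no minimum separation), quadratic local losses $0.5 (x - c_i)^2$, and a running count as the estimate of each client's participation frequency. The names `run_fedavg`, `centers`, and `p` are hypothetical.

```python
import numpy as np


def run_fedavg(centers, p, rounds=20_000, lr=0.01, debias=False, seed=0):
    """Single-client-per-round SGD on f_i(x) = 0.5 * (x - c_i)^2.

    Returns the average of x over the second half of training.
    """
    centers = np.asarray(centers, dtype=float)
    p = np.asarray(p, dtype=float)
    p = p / p.sum()
    M = len(centers)
    rng = np.random.default_rng(seed)
    picks = rng.choice(M, size=rounds, p=p)  # biased participation sequence
    counts = np.ones(M)  # start at 1 so frequency estimates are never zero
    x, tail = 0.0, []
    for t, i in enumerate(picks):
        counts[i] += 1
        grad = x - centers[i]  # gradient of the local quadratic loss
        if debias:
            pi_hat = counts[i] / counts.sum()  # estimated participation frequency
            grad = grad / (M * pi_hat)  # up-weight rarely seen clients
        x -= lr * grad
        if t >= rounds // 2:
            tail.append(x)
    return float(np.mean(tail))


if __name__ == "__main__":
    centers = [0.0, 1.0, 2.0, 3.0]  # unbiased optimum is their mean, 1.5
    p = [0.1, 0.2, 0.3, 0.4]  # hypothetical participation probabilities
    print("plain   :", run_fedavg(centers, p))
    print("debiased:", run_fedavg(centers, p, debias=True))
```

Under these assumptions the plain run should settle near the participation-weighted mean $\sum_i p_i c_i = 2.0$, while the reweighted run should settle near the unweighted mean $1.5$.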

Abstract

In cross-device federated learning (FL) with millions of mobile clients, only a small subset of clients participate in training in every communication round, and Federated Averaging (FedAvg) is the most popular algorithm in practice. Existing analyses of FedAvg usually assume the participating clients are independently sampled in each round from a uniform distribution, which does not reflect real-world scenarios. This paper introduces a theoretical framework that models client participation in FL as a Markov chain to study optimization convergence when clients have non-uniform and correlated participation across rounds. We apply this framework to analyze a more general and practical pattern: every client must wait a minimum number of $R$ rounds (minimum separation) before re-participating. We theoretically prove and empirically observe that increasing minimum separation reduces the bias…

Peer Reviews

Decision · ICLR 2025 Poster

Reviewer 01 · Rating 8 · Confidence 4

Strengths

1. The authors formulate the client participation process as an $R$-th order Markov chain.
2. The authors propose a debiasing FedAvg algorithm based on estimating the marginal stationary distribution of the clients to be sampled.
3. The authors provide convergence analyses of both FedAvg (to expose the problem) and the proposed algorithm, which converges.

Weaknesses

1. Although the paper presents quite a few interesting ideas, it is not well written, and some notation is left unexplained, e.g., $\tau_{mix}$ (is it the mixing time?) and $p_e$.
2. The authors discuss quite a few limitations of the proposed approach and its proofs. These appear to be weaknesses of the paper.

Reviewer 02 · Rating 8 · Confidence 3

Strengths

1. The authors introduce a theoretical framework that models client participation in FL as a Markov chain, allowing the study of optimization convergence when each client must wait at least $R$ rounds before participating again and has its own availability probability.
2. Through both theoretical and empirical results, the authors find that due to non-uniformity and time correlation effects, FL algorithms converge with asymptotic bias, which can be reduced by increasing the minimum separation…

Weaknesses

1. The authors restrict the choices of $R$ to range from $0$ to $M-1$. However, the theoretical analysis only considers cases where $R$ ranges from $0$ to $M-2$. It would be beneficial to include the results for $R=M-1$.
2. In the experiments, the authors simplify the algorithm by partitioning the $N$ clients into $M$ groups, with exactly one group selected in each round. This setup does not align with the more complex proposed algorithm and is insufficient for a comprehensive evaluation of its…

Reviewer 03 · Rating 6 · Confidence 3

Strengths

1. The authors find that the common FL assumption that clients participate independently and uniformly is unrealistic.
2. The paper frames client participation as a Markov process, capturing real-world constraints and interdependencies among clients.
3. The paper proposes Debiasing FedAvg, which converges to an unbiased solution, and supports it with theoretical analysis.
4. Experiments on both synthetic and real datasets validate the algorithm’s effectiveness.

Weaknesses

1. The paper claims that a larger minimum separation $R$ reduces bias. However, it does not discuss empirically how $R$ affects the server model's performance on the test set, or how to choose the best $R$.
2. The paper assumes a uniform minimum separation for all clients, which may not reflect real-world situations.

Videos

Debiasing Federated Learning with Correlated Client Participation · SlidesLive

Taxonomy

Topics: Privacy-Preserving Technologies in Data · Privacy, Security, and Data Protection · Internet Traffic Analysis and Secure E-voting