Federated Learning Using Variance Reduced Stochastic Gradient for Probabilistically Activated Agents
M. R. Rostami, S. S. Kia

TL;DR
This paper introduces a federated learning algorithm that combines variance reduction with probabilistic agent activation, achieving faster convergence and improved efficiency in distributed, privacy-sensitive environments.
Contribution
The paper presents a novel two-layer federated learning algorithm that incorporates variance reduction and probabilistic agent selection, with proven convergence improvements over existing methods.
Findings
Convergence rate improves from O(1/√K) to O(1/K) with a constant step size.
Algorithm effectively reduces variance in stochastic gradient updates.
Numerical experiments demonstrate enhanced performance and faster convergence.
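To make the variance-reduction finding concrete, here is the standard SVRG gradient estimator for a single finite-sum objective, written as a short LaTeX sketch. The notation (f, f_j, w, \tilde{w}, n) is assumed for illustration and is not taken from the paper, whose local update may differ in its details.

```latex
% Assumed notation: f(w) = \frac{1}{n} \sum_{j=1}^{n} f_j(w),
% \tilde{w} is a snapshot point, and j is drawn uniformly from \{1, \dots, n\}.
v = \nabla f_j(w) - \nabla f_j(\tilde{w}) + \nabla f(\tilde{w}),
\qquad
\mathbb{E}_j[v] = \nabla f(w) - \nabla f(\tilde{w}) + \nabla f(\tilde{w}) = \nabla f(w).
```

The estimator is unbiased. In this standard setting, with smooth f_j, both the correction term \nabla f_j(w) - \nabla f_j(\tilde{w}) and the snapshot gradient \nabla f(\tilde{w}) shrink as w and \tilde{w} approach a minimizer of f, so the variance of v vanishes. That is the usual intuition for why SVRG-type methods can keep a constant step size, whereas plain stochastic gradient typically needs a decaying one.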
Abstract
This paper proposes an algorithm for Federated Learning (FL) with a two-layer structure that achieves both variance reduction and a faster convergence rate to an optimal solution in the setting where each agent has an arbitrary probability of selection in each iteration. FL is a practical tool for distributed machine learning when privacy matters. When FL is deployed in an environment where agents (devices) connect irregularly, reaching a trained model both economically and quickly can be demanding. In the first layer of our algorithm, the server propagates the model parameters across agents. In the second layer, each agent performs its local update using a stochastic, variance-reduced technique called Stochastic Variance Reduced Gradient (SVRG). We leverage the concept of variance reduction from stochastic optimization when the agents want to do…
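The two-layer structure described in the abstract can be sketched roughly as follows. This is a minimal toy illustration, not the authors' algorithm. It assumes a scalar least-squares problem, independent Bernoulli activation of each agent, and a plain average over the active agents' models. All function names and parameters (`federated_svrg`, `local_svrg`, `make_agents`, `step`, `local_steps`) are hypothetical.

```python
import random

def sample_grad(x, a, b):
    """Gradient of the per-sample loss 0.5 * (a * x - b) ** 2 with respect to x."""
    return a * (a * x - b)

def full_grad(x, data):
    """Average per-sample gradient over one agent's local data."""
    return sum(sample_grad(x, a, b) for a, b in data) / len(data)

def local_svrg(x_server, data, step, local_steps, rng):
    """Second layer: SVRG steps on one agent, with the server model as snapshot."""
    snapshot = x_server
    mu = full_grad(snapshot, data)  # full local gradient at the snapshot
    y = x_server
    for _ in range(local_steps):
        a, b = rng.choice(data)
        # Variance-reduced direction: stochastic gradient at y, corrected by the
        # same sample's gradient at the snapshot plus the full snapshot gradient.
        v = sample_grad(y, a, b) - sample_grad(snapshot, a, b) + mu
        y -= step * v
    return y

def federated_svrg(agents, probs, rounds, step=0.05, local_steps=10, seed=0):
    """First layer: the server broadcasts x and aggregates the returned models."""
    rng = random.Random(seed)
    x = 0.0
    for _ in range(rounds):
        # Assumption: agent i is activated independently with probability probs[i].
        active = [data for data, p in zip(agents, probs) if rng.random() < p]
        if not active:
            continue  # no agent responded this round; keep the current model
        # Assumption: plain average over active agents (not necessarily the paper's rule).
        x = sum(local_svrg(x, d, step, local_steps, rng) for d in active) / len(active)
    return x

def make_agents(num_agents=5, samples=20, x_true=3.0, seed=1):
    """Synthetic noiseless data: every sample satisfies b = a * x_true."""
    rng = random.Random(seed)
    agents = []
    for _ in range(num_agents):
        data = []
        for _ in range(samples):
            a = rng.uniform(0.5, 1.5)
            data.append((a, a * x_true))
        agents.append(data)
    return agents

if __name__ == "__main__":
    agents = make_agents()
    probs = [0.9, 0.7, 0.5, 0.3, 0.1]  # arbitrary per-agent activation probabilities
    print(federated_svrg(agents, probs, rounds=200))
```

On this noiseless toy problem every agent shares the same minimizer, so the server model should approach `x_true` even though some agents are rarely active. With heterogeneous local data, the choice of aggregation rule matters much more, and the plain average used here is only a placeholder.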
Taxonomy
Topics: Privacy-Preserving Technologies in Data · Stochastic Gradient Optimization Techniques · Distributed Sensor Networks and Detection Algorithms
