Non-Convex Optimization in Federated Learning via Variance Reduction and Adaptive Learning

Dipanwita Thakur; Antonella Guzzo; Giancarlo Fortino; Sajal K. Das

arXiv:2412.11660 · cs.LG · December 17, 2024

TL;DR

This paper introduces a federated learning algorithm that combines momentum-based variance reduction with adaptive learning rates, significantly improving convergence speed and reducing communication costs in non-convex, heterogeneous data settings.

Contribution

It presents a novel federated algorithm that effectively reduces communication complexity to $O(\epsilon^{-1})$ and mitigates client drift, outperforming existing methods in non-convex federated learning.
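
For a rough sense of scale, the bound can be compared with the $O(\epsilon^{-2})$ rate that the abstract attributes to most prior work. The tolerance $\epsilon = 10^{-2}$ below is an illustrative assumption, and the constants hidden by big-O notation are ignored, so these are orders of magnitude rather than round counts reported by the paper:

$$
\epsilon = 10^{-2}: \qquad \epsilon^{-1} = 10^{2}, \qquad \epsilon^{-2} = 10^{4}, \qquad \frac{\epsilon^{-2}}{\epsilon^{-1}} = \epsilon^{-1} = 10^{2}.
$$

Under these assumptions the gap between the two rates grows as the tolerance tightens, since the ratio is itself $\epsilon^{-1}$.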

Findings

01

Achieves $O(\epsilon^{-1})$ communication complexity for convergence.

02

Demonstrates improved test accuracy on MNIST and CIFAR-10 datasets.

03

Effectively mitigates client drift in heterogeneous data environments.

Abstract

This paper proposes a novel federated algorithm that leverages momentum-based variance reduction with adaptive learning to address non-convex settings across heterogeneous data. We intend to minimize communication and computation overhead, thereby fostering a sustainable federated learning system. We aim to overcome challenges related to gradient variance, which hinders the model's efficiency, and the slow convergence resulting from learning rate adjustments with heterogeneous data. The experimental results on the image classification tasks with heterogeneous data reveal the effectiveness of our suggested algorithms in non-convex settings with an improved communication complexity of $O(\epsilon^{-1})$ to converge to an $\epsilon$-stationary point, compared to the existing communication complexity $O(\epsilon^{-2})$ of most prior works. The proposed federated version…
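
The abstract does not spell out the update rule, so the following is only a minimal sketch of the general idea it names: a momentum-based variance-reduced gradient estimator combined with an adaptive step size, run locally on each client and averaged by a server. It is not the authors' algorithm. The estimator and step-size schedule follow the well-known STORM-style recipe, and everything else is a hypothetical choice made for illustration: the scalar toy loss, the function names (`stochastic_grad`, `local_steps`, `federated_train`), the constants `k`, `w`, `c`, the additive noise model, and the decision to reset the estimator at the start of every round.

```python
import random


def stochastic_grad(x, target, noise):
    """Noisy gradient of the toy non-convex loss r^2 / (1 + r^2), with r = x - target."""
    r = x - target
    return 2.0 * r / (1.0 + r * r) ** 2 + noise


def local_steps(x_start, target, steps, rng, k=0.5, w=1.0, c=1.0):
    """Run a few STORM-style local updates on one client (illustrative only)."""
    x_prev = x_start
    x = x_start
    d = 0.0            # variance-reduced gradient estimate
    a = 1.0            # momentum weight; 1.0 makes the first estimate a plain gradient
    grad_sq_sum = 0.0  # running sum of squared stochastic gradients
    for _ in range(steps):
        # One sample per step, evaluated at both the current and previous iterate.
        noise = rng.gauss(0.0, 0.1)
        g = stochastic_grad(x, target, noise)
        g_prev = stochastic_grad(x_prev, target, noise)
        # Momentum-based variance reduction: correct the old estimate with the
        # gradient difference measured on the same sample.
        d = g + (1.0 - a) * (d - g_prev)
        # Adaptive step size that shrinks as gradient energy accumulates.
        grad_sq_sum += g * g
        lr = k / (w + grad_sq_sum) ** (1.0 / 3.0)
        a = min(1.0, c * lr * lr)
        x_prev, x = x, x - lr * d
    return x


def federated_train(targets, rounds=50, steps=5, seed=0):
    """Server loop: broadcast the model, run local steps, average the results."""
    rng = random.Random(seed)
    x = 0.0
    for _ in range(rounds):
        local_models = [local_steps(x, t, steps, rng) for t in targets]
        x = sum(local_models) / len(local_models)
    return x


if __name__ == "__main__":
    # Heterogeneous clients: each one has a different local minimizer.
    print(federated_train([1.0, 2.0, 3.0]))
```

In this toy setup each client's loss has a different minimizer, which stands in for heterogeneous data. Reusing the same sample at the current and previous iterate is what lets the correction term cancel part of the noise. A real implementation would operate on model parameter vectors and mini-batches rather than a scalar and synthetic noise.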

Taxonomy

Topics: Privacy-Preserving Technologies in Data · Stochastic Gradient Optimization Techniques · Machine Learning and ELM