Non-Convex Optimization in Federated Learning via Variance Reduction and Adaptive Learning
Dipanwita Thakur, Antonella Guzzo, Giancarlo Fortino, Sajal K. Das

TL;DR
This paper introduces a federated learning algorithm that combines momentum-based variance reduction with adaptive learning rates, significantly improving convergence speed and reducing communication costs in non-convex, heterogeneous data settings.
Contribution
It presents a novel federated algorithm that reduces communication complexity to \(O(\epsilon^{-1})\) and mitigates client drift, outperforming existing methods in non-convex federated learning.
Findings
Achieves \(O(\epsilon^{-1})\) communication complexity for convergence.
Demonstrates improved test accuracy on MNIST and CIFAR-10 datasets.
Effectively mitigates client drift in heterogeneous data environments.
Abstract
This paper proposes a novel federated algorithm that leverages momentum-based variance reduction with adaptive learning to address non-convex settings across heterogeneous data. We intend to minimize communication and computation overhead, thereby fostering a sustainable federated learning system. We aim to overcome challenges related to gradient variance, which hinders the model's efficiency, and the slow convergence resulting from learning rate adjustments with heterogeneous data. The experimental results on image classification tasks with heterogeneous data reveal the effectiveness of our suggested algorithms in non-convex settings, with an improved communication complexity of \(O(\epsilon^{-1})\) to converge to an \(\epsilon\)-stationary point, compared to the existing communication complexity of most prior works. The proposed federated version…
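For a concrete picture of the ingredients named in the abstract, the Python sketch below shows one common form of momentum-based variance reduction (a STORM-style recursive estimator) combined with a step size that shrinks as squared gradient norms accumulate, run locally on each client and averaged by a server. It is not the paper's algorithm. The function names, the constants `k`, `w`, and `c`, the cube-root step-size rule, the plain-averaging server, and the toy quadratic clients are all assumptions made for illustration.

```python
import random


def vr_direction(g_curr, g_prev, d_prev, a):
    # d_t = g(x_t; s_t) + (1 - a) * (d_{t-1} - g(x_{t-1}; s_t)),
    # where both gradients are evaluated on the same sample s_t.
    return [gc + (1.0 - a) * (dp - gp)
            for gc, gp, dp in zip(g_curr, g_prev, d_prev)]


def adaptive_lr(grad_sq_sum, k=0.1, w=1.0):
    # Assumed rule: the step size decays as squared gradient norms accumulate.
    return k / (w + grad_sq_sum) ** (1.0 / 3.0)


def client_update(x, grad_fn, sample_fn, steps, c=1.0):
    x = list(x)
    d = grad_fn(x, sample_fn())          # first direction: plain stochastic gradient
    sq_sum = sum(v * v for v in d)
    for _ in range(steps):
        lr = adaptive_lr(sq_sum)
        x_prev, x = x, [xj - lr * dj for xj, dj in zip(x, d)]
        a = min(1.0, c * lr * lr)        # momentum weight tied to the step size
        s = sample_fn()                  # one fresh sample, reused at both points
        g_curr, g_prev = grad_fn(x, s), grad_fn(x_prev, s)
        d = vr_direction(g_curr, g_prev, d, a)
        sq_sum += sum(v * v for v in g_curr)
    return x


def server_round(x_global, clients, steps=5):
    # Plain averaging of the client models (assumed aggregation rule).
    results = [client_update(x_global, gf, sf, steps) for gf, sf in clients]
    return [sum(r[j] for r in results) / len(results)
            for j in range(len(x_global))]


def make_client(b, noise_std, rng):
    # Hypothetical toy client: f(x) = 0.5 * (x - b)^2 with additive gradient noise.
    grad_fn = lambda x, s: [x[0] - b + s]
    sample_fn = lambda: rng.gauss(0.0, noise_std)
    return grad_fn, sample_fn


if __name__ == "__main__":
    rng = random.Random(0)
    # Two clients with different minimizers stand in for heterogeneous data.
    clients = [make_client(1.0, 0.1, rng), make_client(3.0, 0.1, rng)]
    x = [0.0]
    for _ in range(50):
        x = server_round(x, clients)
    print(x)
```

The toy objective is a convex quadratic, so it only exercises the mechanics of the update and says nothing about the non-convex guarantees claimed in the abstract. Reusing one sample `s` for both gradient evaluations is what lets the correction term cancel shared noise between consecutive iterates.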
Taxonomy
Topics: Privacy-Preserving Technologies in Data · Stochastic Gradient Optimization Techniques · Machine Learning and ELM
