Accelerated Primal-Dual Algorithm for Distributed Non-convex Optimization
Shengjun Zhang, Colleen P. Bailey

TL;DR
This paper introduces an accelerated distributed primal-dual stochastic gradient descent algorithm with a 'powerball' method, achieving linear speedup for non-convex optimization and demonstrating superior performance on neural network training tasks.
Contribution
It proposes a novel accelerated primal-dual SGD algorithm with a 'powerball' method that improves convergence speed for distributed non-convex optimization.
Findings
Achieves a linear-speedup convergence rate of O(1/√(nT)), where n is the number of agents and T the number of iterations
Outperforms existing distributed SGD algorithms in experiments
Effective for training neural networks on MNIST
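To see what linear speedup means in practice, the rate can be inverted into an iteration count. This is a back-of-the-envelope reading rather than a statement from the paper: it assumes n agents and T iterations, and both the constant C hidden by the O-notation and the target accuracy ε are symbols introduced here for illustration.

```latex
\frac{C}{\sqrt{nT}} \le \epsilon
\quad\Longleftrightarrow\quad
T \ge \frac{C^{2}}{n\,\epsilon^{2}}
```

Under this reading, doubling the number of agents halves the number of iterations needed to reach the same accuracy ε.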
Abstract
This paper investigates accelerating the convergence of distributed optimization algorithms on non-convex problems. We propose a distributed primal-dual stochastic gradient descent (SGD) algorithm equipped with the "powerball" method to accelerate convergence. We show that the proposed algorithm achieves the linear speedup convergence rate for general smooth (possibly non-convex) cost functions. We demonstrate the efficiency of the algorithm through numerical experiments, training two-layer fully connected neural networks and convolutional neural networks on the MNIST dataset and comparing against state-of-the-art distributed SGD algorithms and centralized SGD algorithms.
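The "powerball" method is commonly described as an elementwise transform of the gradient, sign(g)·|g|^γ with γ in (0, 1). The snippet below is a minimal single-node sketch of that transform inside a plain SGD step. It is not the distributed primal-dual update proposed in the paper, which this summary does not spell out; the names `powerball` and `powerball_sgd_step`, the step size, and the value of `gamma` are all illustrative assumptions.

```python
import numpy as np


def powerball(grad, gamma=0.5):
    """Elementwise sign(g) * |g|**gamma; gamma = 1 leaves the gradient unchanged."""
    grad = np.asarray(grad, dtype=float)
    return np.sign(grad) * np.abs(grad) ** gamma


def powerball_sgd_step(x, stochastic_grad, step_size=0.1, gamma=0.5):
    """One SGD step using the transformed gradient (hypothetical helper)."""
    return x - step_size * powerball(stochastic_grad, gamma)


# Toy gradient: large entries are shrunk and small entries are enlarged,
# while every sign is preserved.
g = np.array([-4.0, 0.0, 0.25])
x_next = powerball_sgd_step(np.zeros(3), g)
```

With γ < 1 the transform boosts gradient entries whose magnitude is below 1 and damps those above 1, which is the usual intuition given for why it can speed up early progress.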
Taxonomy
Topics: Stochastic Gradient Optimization Techniques · Machine Learning and ELM · Sparse and Compressive Sensing Techniques
