Asynchronous Stochastic Composition Optimization with Variance Reduction
Shuheng Shen, Linli Xu, Jingchang Liu, Junliang Guo, Qing Ling

TL;DR
This paper introduces two asynchronous parallel algorithms with variance reduction for large-scale stochastic composition optimization, achieving linear convergence and speedup in distributed machine learning settings.
Contribution
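The paper proposes two asynchronous parallel variance-reduced algorithms for composition optimization: AsyVRSC-Shared for the shared-memory architecture and AsyVRSC-Distributed for the master-worker architecture. Both are aimed at large-scale data sets and come with proven linear convergence and linear speedup.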
Findings
Both algorithms achieve linear convergence rates, enabled by the embedded variance reduction techniques.
Both algorithms enjoy provable linear speedup when the time delays are suitably bounded.
Experiments verify effectiveness on large-scale problems.
Abstract
Composition optimization has drawn a lot of attention in a wide variety of machine learning domains, from risk management to reinforcement learning. Existing methods for the composition optimization problem often work in a sequential, single-machine manner, which limits their application to large-scale problems. To address this issue, this paper proposes two asynchronous parallel variance-reduced stochastic compositional gradient (AsyVRSC) algorithms that are suited to handling large-scale data sets. The two algorithms are AsyVRSC-Shared for the shared-memory architecture and AsyVRSC-Distributed for the master-worker architecture. The embedded variance reduction techniques enable the algorithms to achieve linear convergence rates. Furthermore, AsyVRSC-Shared and AsyVRSC-Distributed enjoy provable linear speedup when the time delays are bounded by the data dimensionality or the…
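To make the idea of a variance-reduced stochastic compositional gradient concrete, here is a minimal serial sketch in Python. It is not the paper's AsyVRSC-Shared or AsyVRSC-Distributed algorithm: there is no asynchrony, no shared memory, and no master-worker communication. It assumes the usual finite-sum composition form F(x) = (1/n) sum_i F_i(G(x)) with G(x) = (1/m) sum_j G_j(x), and uses an SVRG-style snapshot correction. The toy problem (linear inner maps G_j(x) = A_j x, quadratic outer losses F_i(y) = 0.5 * ||y - b_i||^2), the function names, and all hyperparameters are illustrative assumptions.

```python
import numpy as np


def objective(x, A, b):
    """F(x) = (1/n) sum_i 0.5 * ||G(x) - b_i||^2, with G(x) = (1/m) sum_j A_j x."""
    y = A.mean(axis=0) @ x
    return 0.5 * np.mean(np.sum((y - b) ** 2, axis=1))


def full_gradient(x, A, b):
    """Exact gradient of the toy objective via the chain rule."""
    A_bar = A.mean(axis=0)                      # Jacobian of the inner map G
    return A_bar.T @ (A_bar @ x - b.mean(axis=0))


def vr_compositional_sgd(A, b, step=0.1, epochs=20, inner_steps=50, batch=5, seed=0):
    """Serial SVRG-style compositional gradient loop (hypothetical helper)."""
    rng = np.random.default_rng(seed)
    m, d = A.shape[0], A.shape[2]
    n = b.shape[0]
    x_snap = np.zeros(d)
    for _ in range(epochs):
        # Snapshot: exact inner value and exact gradient at x_snap.
        g_snap = A.mean(axis=0) @ x_snap
        grad_snap = full_gradient(x_snap, A, b)
        x = x_snap.copy()
        for _ in range(inner_steps):
            # Variance-reduced estimate of the inner function value G(x).
            idx = rng.integers(0, m, size=batch)
            g_hat = g_snap + (A[idx] @ (x - x_snap)).mean(axis=0)
            # Variance-reduced estimate of the compositional gradient.
            i = rng.integers(0, n)
            j = rng.integers(0, m)
            grad_est = (A[j].T @ (g_hat - b[i])
                        - A[j].T @ (g_snap - b[i])
                        + grad_snap)
            x -= step * grad_est
        x_snap = x
    return x_snap


# Small synthetic instance (assumed data, for illustration only).
data_rng = np.random.default_rng(1)
m, n, d = 20, 30, 5
A = np.eye(d) + 0.1 * data_rng.standard_normal((m, d, d))
b = np.ones(d) + 0.1 * data_rng.standard_normal((n, d))

x0 = np.zeros(d)
x_final = vr_compositional_sgd(A, b)
print("objective at start:", objective(x0, A, b))
print("objective at end:  ", objective(x_final, A, b))
```

The snapshot terms cancel most of the sampling noise as the iterate approaches the snapshot, which is the mechanism that variance-reduced methods rely on for linear convergence. An asynchronous variant would let several workers run the inner loop concurrently on possibly stale copies of x, which is where the bounded time delays mentioned in the abstract come in.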
Taxonomy
Topics: Stochastic Gradient Optimization Techniques · Machine Learning and ELM · Advanced Neural Network Applications
