Convergence and Stability of the Stochastic Proximal Point Algorithm with Momentum
Junhyung Lyle Kim, Panos Toulis, Anastasios Kyrillidis

TL;DR
This paper analyzes the convergence and stability of a stochastic proximal point algorithm with momentum, demonstrating improved convergence rates and broader stability conditions compared to stochastic gradient methods.
Contribution
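The paper introduces and studies the stochastic proximal point algorithm with momentum (SPPAM), showing that it converges faster than the stochastic proximal point algorithm (SPPA) and remains stable over a broader range of hyperparameters than stochastic gradient descent with momentum (SGDM).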
Findings
SPPAM converges to a neighborhood of the solution faster than SPPA.
SPPAM has a more favorable dependence on problem constants than SGDM.
SPPAM converges for a wider range of hyperparameters (step size and momentum) than SGDM.
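To make the findings concrete, here is a minimal sketch of one way SPPAM can be run on a least-squares problem. It rests on assumptions not stated on this page: that the update takes the implicit form x_{t+1} = x_t - eta * grad f_i(x_{t+1}) + beta * (x_t - x_{t-1}), and that each sampled loss is f_i(x) = 0.5 * (a_i^T x - b_i)^2, for which the implicit step has a closed form. The function name `sppam_least_squares` and the values of `eta` and `beta` are illustrative only and are not taken from the paper.

```python
import numpy as np


def sppam_least_squares(A, b, eta=1.0, beta=0.3, n_iters=2000, seed=0):
    """Sketch of SPPAM for f_i(x) = 0.5 * (a_i^T x - b_i)^2 (assumed setup)."""
    rng = np.random.default_rng(seed)
    n, d = A.shape
    x_prev = np.zeros(d)
    x = np.zeros(d)
    for _ in range(n_iters):
        i = rng.integers(n)              # sample one data point uniformly
        a_i, b_i = A[i], b[i]
        y = x + beta * (x - x_prev)      # momentum extrapolation
        # Implicit step: solve x_new = y - eta * a_i * (a_i^T x_new - b_i).
        # For this quadratic loss the residual at x_new has a closed form.
        r = (a_i @ y - b_i) / (1.0 + eta * (a_i @ a_i))
        x_prev, x = x, y - eta * r * a_i
    return x


# Hypothetical usage on a small, noiseless synthetic problem.
rng = np.random.default_rng(1)
A = rng.standard_normal((200, 5))
x_true = rng.standard_normal(5)
b = A @ x_true
x_hat = sppam_least_squares(A, b)
err = np.linalg.norm(x_hat - x_true)
```

Because the gradient is evaluated at the new iterate, the effective step along `a_i` is scaled by 1 / (1 + eta * ||a_i||^2), so it stays bounded however large `eta` is chosen. This is the usual intuition for why proximal point methods tolerate imperfect step-size tuning better than explicit gradient steps.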
Abstract
Stochastic gradient descent with momentum (SGDM) is the dominant algorithm in many optimization scenarios, including convex optimization instances and non-convex neural network training. Yet, in the stochastic setting, momentum interferes with gradient noise, often leading to specific step size and momentum choices in order to guarantee convergence, set aside acceleration. Proximal point methods, on the other hand, have gained much attention due to their numerical stability and elasticity against imperfect tuning. Their stochastic accelerated variants though have received limited attention: how momentum interacts with the stability of (stochastic) proximal point methods remains largely unstudied. To address this, we focus on the convergence and stability of the stochastic proximal point algorithm with momentum (SPPAM), and show that SPPAM allows a faster linear convergence to a…
Taxonomy
Topics: Stochastic Gradient Optimization Techniques · Sparse and Compressive Sensing Techniques · Privacy-Preserving Technologies in Data
