RanSOM: Second-Order Momentum with Randomized Scaling for Constrained and Unconstrained Optimization
El Mahdi Chayti

TL;DR
RanSOM introduces a randomized momentum framework that corrects bias in stochastic optimization, achieving optimal convergence rates without expensive auxiliary sampling.
Contribution
It proposes a unified, unbiased momentum method using randomized steps, improving convergence rates in constrained and unconstrained optimization.
Findings
Achieves $\tilde{O}(1/\epsilon^3)$ convergence rate.
Handles heavy-tailed noise with optimal rates.
Eliminates bias with a single Hessian-vector product.
Abstract
Momentum methods, such as Polyak's Heavy Ball, are the standard for training deep networks but suffer from curvature-induced bias in stochastic settings, limiting convergence to suboptimal rates. Existing corrections typically require expensive auxiliary sampling or restrictive smoothness assumptions. We propose RanSOM, a unified framework that eliminates this bias by replacing deterministic step sizes with randomized steps drawn from distributions whose mean equals the nominal step size. This modification allows us to leverage Stein-type identities to compute an exact, unbiased estimate of the momentum bias using a single Hessian-vector product computed jointly with the gradient, avoiding auxiliary queries. We instantiate this framework in two algorithms: RanSOM-E for unconstrained optimization (using exponentially distributed steps) and RanSOM-B for…
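The abstract does not spell out the Stein-type identity it relies on. As a hedged illustration of the general idea for exponentially distributed steps, the sketch below checks a standard integration-by-parts identity: if T is exponential with mean eta, then E[grad f(x + T d)] - grad f(x) = eta * E[Hess f(x + T d) d]. Whether RanSOM-E uses exactly this form is an assumption. The test objective, the helper names `grad`, `hvp`, and `exp_expectation`, and the use of Gauss-Laguerre quadrature in place of sampling are all illustrative choices, not taken from the paper.

```python
import numpy as np
from numpy.polynomial.laguerre import laggauss

# Hypothetical test objective: f(x) = sum(x**4) / 4 + 0.5 * x @ A @ x
def grad(x, A):
    return x**3 + A @ x

def hvp(x, d, A):
    # Hessian-vector product: (diag(3 x^2) + A) @ d
    return 3.0 * x**2 * d + A @ d

def exp_expectation(fn, eta, n_nodes=16):
    # E[fn(T)] for T ~ Exponential(mean=eta), via Gauss-Laguerre quadrature.
    # Substituting t = eta * s gives E[fn(T)] = integral of exp(-s) * fn(eta * s) ds.
    nodes, weights = laggauss(n_nodes)
    return sum(w * fn(eta * s) for s, w in zip(nodes, weights))

rng = np.random.default_rng(0)
dim = 4
M = rng.standard_normal((dim, dim))
A = M @ M.T                      # symmetric positive semidefinite quadratic part
x = rng.standard_normal(dim)     # current iterate
d = rng.standard_normal(dim)     # step direction
eta = 0.1                        # mean of the random step length

# Randomized step: the expected gradient shift equals eta times the expected HVP.
lhs = exp_expectation(lambda t: grad(x + t * d, A), eta) - grad(x, A)
rhs = eta * exp_expectation(lambda t: hvp(x + t * d, d, A), eta)
identity_gap = float(np.max(np.abs(lhs - rhs)))

# Deterministic step of the same length: the same one-HVP correction leaves a residual.
det_lhs = grad(x + eta * d, A) - grad(x, A)
det_rhs = eta * hvp(x + eta * d, d, A)
det_gap = float(np.max(np.abs(det_lhs - det_rhs)))

print(f"randomized-step gap:    {identity_gap:.2e}")
print(f"deterministic-step gap: {det_gap:.2e}")
```

In this toy setting the randomized-step gap sits at floating-point round-off, while the deterministic step leaves a residual of order eta squared. That is consistent with the abstract's claim that randomizing the step lets a single Hessian-vector product account for the gradient shift exactly in expectation.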
Taxonomy
Topics: Stochastic Gradient Optimization Techniques · Gaussian Processes and Bayesian Inference · Markov Chains and Monte Carlo Methods
