Stochastic First-Order Methods with Non-smooth and Non-Euclidean Proximal Terms for Nonconvex High-Dimensional Stochastic Optimization

Yue Xie; Jiawen Bi; Hongcheng Liu

arXiv:2406.19475 · math.OC · October 1, 2024

Open Access

TL;DR

This paper introduces dimension-insensitive stochastic first-order methods for nonconvex stochastic optimization, achieving improved sample complexity bounds and accommodating non-smooth, non-Euclidean proximal terms, with practical numerical validation.

Contribution

The work develops novel DISFOM algorithms that are dimension-insensitive and handle non-smooth, non-Euclidean proximal functions, with enhanced theoretical sample complexity bounds.

Findings

1. Sample complexity of O((log d)/ε^4) for minibatch-based DISFOM.

2. Variance reduction improves complexity to O((log d)^{2/3}/ε^{10/3}).

3. Numerical experiments confirm the dimension-insensitive property.
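
The gain from variance reduction can be read off by dividing the two bounds stated above. This is a back-of-the-envelope comparison that ignores the constants hidden in the O(·) notation, not a statement taken from the paper:

$$\frac{(\log d)/\epsilon^{4}}{(\log d)^{2/3}/\epsilon^{10/3}} = (\log d)^{1/3}\,\epsilon^{-2/3}$$

Under that simplification, the advantage of the variance-reduced bound grows as $\epsilon$ shrinks and depends only mildly on $d$.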

Abstract

When the nonconvex problem is complicated by stochasticity, the sample complexity of stochastic first-order methods may depend linearly on the problem dimension, which is undesirable for large-scale problems. In this work, we propose dimension-insensitive stochastic first-order methods (DISFOMs) to address nonconvex optimization with an expected-valued objective function. Our algorithms allow for non-Euclidean and non-smooth distance functions as the proximal terms. Under mild assumptions, we show that DISFOM using minibatches to estimate the gradient enjoys a sample complexity of $O((\log d)/\epsilon^{4})$ to obtain an $\epsilon$-stationary point. Furthermore, we prove that DISFOM employing variance reduction can sharpen this bound to $O((\log d)^{2/3}/\epsilon^{10/3})$, which perhaps leads to the best-known sample complexity result in terms of $d$. We provide…
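
To make "dimension-insensitive" concrete, the sketch below compares how the two stated bounds scale with $d$ against a dimension-sensitive rate. It is an illustration only: all constants are dropped, the function names are hypothetical, and the linear rate `d / eps**4` is an assumed stand-in for "depends linearly on the problem dimension" rather than a bound quoted from the paper.

```python
import math


def linear_in_d(d, eps):
    """Assumed dimension-sensitive rate d / eps^4 (illustrative stand-in)."""
    return d / eps**4


def minibatch_rate(d, eps):
    """Stated minibatch DISFOM bound (log d) / eps^4, constants dropped."""
    return math.log(d) / eps**4


def variance_reduced_rate(d, eps):
    """Stated variance-reduced bound (log d)^(2/3) / eps^(10/3), constants dropped."""
    return math.log(d) ** (2 / 3) / eps ** (10 / 3)


def growth_factor(rate, d_small, d_large, eps):
    """Factor by which a rate grows when the dimension goes from d_small to d_large."""
    return rate(d_large, eps) / rate(d_small, eps)


if __name__ == "__main__":
    eps = 0.1
    rates = [
        ("linear in d (assumed)", linear_in_d),
        ("minibatch DISFOM", minibatch_rate),
        ("variance-reduced DISFOM", variance_reduced_rate),
    ]
    for name, rate in rates:
        factor = growth_factor(rate, 1e3, 1e6, eps)
        print(f"{name}: grows by a factor of {factor:.2f}")
```

Raising the dimension from $10^3$ to $10^6$ multiplies the assumed linear rate by 1000, but only doubles the $(\log d)/\epsilon^{4}$ rate, since $\log 10^6 / \log 10^3 = 2$.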

Taxonomy

Topics: Stochastic Gradient Optimization Techniques · Sparse and Compressive Sensing Techniques · Advanced Optimization Algorithms Research