On the Convergence to a Global Solution of Shuffling-Type Gradient Algorithms

Lam M. Nguyen; Trang H. Tran

arXiv:2206.05869·cs.LG·October 27, 2023·1 cites

TL;DR

This paper proves that shuffling SGD, a practical variant of stochastic gradient descent, converges to a global solution for certain non-convex functions in over-parameterized models, under relaxed assumptions.

Contribution

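The paper gives a convergence analysis of shuffling SGD to a global solution for a class of non-convex functions, under assumptions more relaxed than those in previous literature, while matching the computational complexity shuffling SGD achieves in the general convex setting.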

Findings

01 Convergence to global solutions under over-parameterization

02 Relaxed non-convex assumptions compared to prior work

03 Maintains computational complexity similar to convex settings

Abstract

The stochastic gradient descent (SGD) algorithm is the method of choice in many machine learning tasks thanks to its scalability and efficiency in dealing with large-scale problems. In this paper, we focus on the shuffling version of SGD, which matches the mainstream practical heuristics. We show the convergence to a global solution of shuffling SGD for a class of non-convex functions under over-parameterized settings. Our analysis employs more relaxed non-convex assumptions than previous literature. Nevertheless, we maintain the desired computational complexity that shuffling SGD has achieved in the general convex setting.
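
To make the term "shuffling SGD" concrete, here is a minimal sketch of the general idea: in each epoch the data indices are reshuffled and every sample is visited exactly once, instead of being drawn with replacement. The toy data, the step size `lr=0.3`, and the names `shuffling_sgd`, `residual`, and `loss` are all assumptions made for illustration and are not taken from the paper. The example is an over-parameterized least-squares problem (more parameters than samples), which is convex, so it only illustrates the update scheme and the interpolation regime, not the non-convex function class the paper analyzes.

```python
import random

# Toy over-parameterized least-squares problem (hypothetical data):
# n = 3 samples, d = 5 parameters, so a zero-loss (interpolating) solution exists.
X = [[1.0, 0.0, 0.0, 1.0, 0.0],
     [0.0, 1.0, 0.0, 0.0, 1.0],
     [0.0, 0.0, 1.0, 1.0, 1.0]]
y = [1.0, 2.0, 3.0]


def residual(w, i):
    """Prediction error of the linear model on sample i."""
    return sum(xj * wj for xj, wj in zip(X[i], w)) - y[i]


def loss(w):
    """Average of the per-sample losses f_i(w) = 0.5 * residual(w, i)**2."""
    return sum(0.5 * residual(w, i) ** 2 for i in range(len(X))) / len(X)


def shuffling_sgd(w0, epochs, lr, seed=0):
    """Shuffling-type SGD: one pass over a fresh random permutation per epoch."""
    rng = random.Random(seed)
    w = list(w0)
    n = len(X)
    for _ in range(epochs):
        order = list(range(n))
        rng.shuffle(order)  # new permutation of the sample indices each epoch
        for i in order:     # every sample is used exactly once per epoch
            r = residual(w, i)
            # The gradient of f_i at w is r * x_i; take one step along its negative.
            w = [wj - lr * r * xj for wj, xj in zip(w, X[i])]
    return w


w_init = [0.0] * 5
w_final = shuffling_sgd(w_init, epochs=200, lr=0.3)
print("initial loss:", loss(w_init))
print("final loss:  ", loss(w_final))
```

Because the system has more unknowns than equations and the rows of `X` are linearly independent, an exact fit exists, and the final loss in this sketch should be very close to zero. Replacing `rng.shuffle(order)` with a fixed order would give an incremental-gradient variant of the same scheme.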

Videos

On the Convergence to a Global Solution of Shuffling-Type Gradient Algorithms · SlidesLive

Taxonomy

Topics: Stochastic Gradient Optimization Techniques · Sparse and Compressive Sensing Techniques · Machine Learning and ELM