A Unified Convergence Analysis for Shuffling-Type Gradient Methods

Lam M. Nguyen; Quoc Tran-Dinh; Dzung T. Phan; Phuong Ha Nguyen; Marten van Dijk

arXiv:2002.08246 · math.OC · September 21, 2021 · 26 cites

TL;DR

This paper provides a comprehensive convergence analysis for various shuffling-type gradient methods in finite-sum optimization, covering both convex and nonconvex cases, with improved rates and practical insights.

Contribution

The paper derives new convergence rates for shuffling-type gradient methods that hold under any sampling-without-replacement strategy, unifying and extending prior analyses.

Findings

1. New non-asymptotic and asymptotic convergence rates for nonconvex and convex settings.

2. Improved nonconvex convergence rate over existing methods.

3. Empirical validation on logistic regression and neural network training.

Abstract

In this paper, we propose a unified convergence analysis for a class of generic shuffling-type gradient methods for solving finite-sum optimization problems. Our analysis works with any sampling without replacement strategy and covers many known variants such as randomized reshuffling, deterministic or randomized single permutation, and cyclic and incremental gradient schemes. We focus on two different settings: strongly convex and nonconvex problems, but also discuss the non-strongly convex case. Our main contribution consists of new non-asymptotic and asymptotic convergence rates for a wide class of shuffling-type gradient methods in both nonconvex and convex settings. We also study uniformly randomized shuffling variants with different learning rates and model assumptions. While our rate in the nonconvex case is new and significantly improved over existing works under standard…
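To make the family of methods named in the abstract concrete, here is a minimal Python sketch of a generic shuffling-type gradient loop. It is an illustration under stated assumptions, not the authors' algorithm as published: the function name `shuffling_gradient`, the `strategy` labels, the per-step scaling `lr / n`, and the toy least-squares data are all choices made for this example. Each epoch visits every component gradient exactly once, and the three strategies differ only in how the visiting order is chosen.

```python
import random


def shuffling_gradient(grad_i, w0, n, epochs, lr, strategy="reshuffle", seed=0):
    """Illustrative shuffling-type gradient loop (hypothetical helper).

    grad_i(w, i) returns the gradient of the i-th component function at w.
    """
    rng = random.Random(seed)
    w = list(w0)

    # Permutation drawn once and reused in every epoch ("single_shuffle").
    fixed_perm = list(range(n))
    rng.shuffle(fixed_perm)

    for _ in range(epochs):
        if strategy == "reshuffle":
            # Randomized reshuffling: a fresh permutation each epoch.
            perm = list(range(n))
            rng.shuffle(perm)
        elif strategy == "single_shuffle":
            perm = fixed_perm
        elif strategy == "incremental":
            # Cyclic / incremental gradient: fixed natural order.
            perm = list(range(n))
        else:
            raise ValueError(f"unknown strategy: {strategy}")

        step = lr / n  # assumed per-step scaling for this sketch
        for i in perm:  # sampling without replacement within the epoch
            g = grad_i(w, i)
            w = [wj - step * gj for wj, gj in zip(w, g)]
    return w


# Toy finite-sum problem: f_i(w) = 0.5 * (a_i . w - b_i)^2 with exact labels.
rows = [(1.0, 0.0), (0.0, 1.0), (1.0, 1.0), (1.0, -1.0)]
w_true = [2.0, -1.0]
targets = [sum(a * t for a, t in zip(row, w_true)) for row in rows]


def grad_i(w, i):
    residual = sum(a * wj for a, wj in zip(rows[i], w)) - targets[i]
    return [residual * a for a in rows[i]]


for name in ("reshuffle", "single_shuffle", "incremental"):
    w = shuffling_gradient(grad_i, [0.0, 0.0], len(rows), 200, 0.4, strategy=name)
    print(name, [round(x, 6) for x in w])
```

Because the toy system is consistent, every ordering drives the iterate toward `w_true`; on noisy or nonconvex problems the orderings can behave differently, which is the regime the paper's rates address.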

Taxonomy

Topics: Stochastic Gradient Optimization Techniques · Sparse and Compressive Sensing Techniques · Machine Learning and ELM

Methods: Logistic Regression