Tighter Lower Bounds for Shuffling SGD: Random Permutations and Beyond

Jaeyoung Cha; Jaewook Lee; Chulhee Yun

arXiv:2303.07160 · cs.LG · June 12, 2023 · 1 citation

TL;DR

This paper establishes tight convergence lower bounds for shuffling (without-replacement) SGD on smooth convex and strongly convex finite-sum problems, covering both Random Reshuffling and arbitrary permutation-based variants.

Contribution

The paper gives the first lower bounds for weighted average iterates of shuffling SGD that match the known upper bounds, and it extends the lower-bound analysis to arbitrary permutation-based SGD.

Findings

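01. For SGD with Random Reshuffling, lower bounds with tighter dependence on the condition number $κ$ than existing bounds.

02. The first results to close the gap between lower and upper bounds for weighted average iterates, in both the strongly convex and convex cases.

03. Lower bounds for weighted average iterates of arbitrary permutation-based SGD, which apply to every variant, including those that carefully choose the best permutation.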

Abstract

We study convergence lower bounds of without-replacement stochastic gradient descent (SGD) for solving smooth (strongly-)convex finite-sum minimization problems. Unlike most existing results focusing on final iterate lower bounds in terms of the number of components $n$ and the number of epochs $K$, we seek bounds for arbitrary weighted average iterates that are tight in all factors including the condition number $κ$. For SGD with Random Reshuffling, we present lower bounds that have tighter $κ$ dependencies than existing bounds. Our results are the first to perfectly close the gap between lower and upper bounds for weighted average iterates in both strongly-convex and convex cases. We also prove weighted average iterate lower bounds for arbitrary permutation-based SGD, which apply to all variants that carefully choose the best permutation. Our bounds improve the existing…
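To make the setting in the abstract concrete, the following is a minimal sketch of without-replacement SGD with Random Reshuffling and a weighted average iterate. It is an illustration under stated assumptions, not the paper's construction: the one-dimensional quadratic components `f_i(x) = 0.5 * a_i * (x - b_i)**2`, the constant step size, the choice to average end-of-epoch iterates, and all function names (`shuffling_sgd`, `weighted_average`, `minimizer`) are hypothetical and chosen only for readability.

```python
import random


def grad(a_i, b_i, x):
    """Gradient of the toy component f_i(x) = 0.5 * a_i * (x - b_i)**2."""
    return a_i * (x - b_i)


def shuffling_sgd(a, b, x0, step_size, epochs, rng, reshuffle=True):
    """Run without-replacement SGD for `epochs` passes over n = len(a) components.

    With reshuffle=True a fresh permutation is drawn every epoch (Random
    Reshuffling); with reshuffle=False the same fixed order is reused.
    Returns the initial point followed by each end-of-epoch iterate.
    """
    order = list(range(len(a)))
    x = x0
    epoch_iterates = [x]
    for _ in range(epochs):
        if reshuffle:
            rng.shuffle(order)  # new permutation for this epoch
        for i in order:  # each component is visited exactly once per epoch
            x -= step_size * grad(a[i], b[i], x)
        epoch_iterates.append(x)
    return epoch_iterates


def weighted_average(iterates, weights):
    """Weighted average of iterates (assumption: nonnegative weights, positive sum)."""
    total = sum(weights)
    return sum(w * x for w, x in zip(weights, iterates)) / total


def minimizer(a, b):
    """Closed-form minimizer of F(x) = (1/n) * sum_i f_i(x) for the toy problem."""
    return sum(a_i * b_i for a_i, b_i in zip(a, b)) / sum(a)


if __name__ == "__main__":
    a = [1.0, 2.0, 3.0, 4.0]   # component curvatures (toy values)
    b = [1.0, -1.0, 2.0, 0.5]  # component minimizers (toy values)
    iterates = shuffling_sgd(a, b, x0=0.0, step_size=0.01, epochs=500,
                             rng=random.Random(0))
    x_avg = weighted_average(iterates, [1.0] * len(iterates))
    print("final iterate:", iterates[-1])
    print("uniform average:", x_avg)
    print("minimizer:", minimizer(a, b))
```

Passing a different `weights` list to `weighted_average` gives other averaging schemes; the final iterate corresponds to putting all weight on the last entry.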

Code & Models

Repositories

garywei944/grab-sampler
pytorch

Videos

Tighter Lower Bounds for Shuffling SGD: Random Permutations and Beyond · SlidesLive

Taxonomy

Topics: Stochastic Gradient Optimization Techniques · Sparse and Compressive Sensing Techniques · Statistical Methods and Inference

Methods: Stochastic Gradient Descent