Randomized Stochastic Gradient Descent Ascent

Othmane Sebbouh; Marco Cuturi; Gabriel Peyré

arXiv:2111.13162 · cs.LG · November 29, 2021 · 1 citation

Open Access

TL;DR

This paper introduces RSGDA (Randomized SGDA), a stochastic gradient descent ascent algorithm with a randomized inner loop size. It provides the first almost sure convergence rates among SGDA algorithms for nonconvex min/strongly-concave max problems, and demonstrates its effectiveness on distributionally robust optimization and optimal transport tasks.

Contribution

RSGDA replaces the fixed inner loop of ESGDA with a stochastic loop size, which simplifies the theoretical analysis and yields almost sure convergence rates for nonconvex min/strongly-concave max problems.
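For orientation, the nonconvex min/strongly-concave max setting can be written in the usual min-max notation. The symbols below ($f$, $\Phi$, $\mu$, $\mathcal{Y}$) are assumptions chosen for illustration and do not appear on this page:

```latex
\min_{x \in \mathbb{R}^d} \; \Phi(x),
\qquad
\Phi(x) := \max_{y \in \mathcal{Y}} f(x, y),
\qquad
f(x, \cdot) \ \text{is } \mu\text{-strongly concave for every } x,
\quad
f(\cdot, y) \ \text{may be nonconvex.}
```

Strong concavity in $y$ makes the inner maximizer unique for each $x$, which is what makes it reasonable to track it with a few ascent steps between descent steps.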

Findings

01

Achieves almost sure convergence in nonconvex-strongly-concave settings.

02

Provides optimal loop size parameters for best convergence.

03

Effective in distributionally robust optimization and optimal transport applications.

Abstract

An increasing number of machine learning problems, such as robust or adversarial variants of existing algorithms, require minimizing a loss function that is itself defined as a maximum. Carrying out a loop of stochastic gradient ascent (SGA) steps on the (inner) maximization problem, followed by an SGD step on the (outer) minimization, is known as Epoch Stochastic Gradient Descent Ascent (ESGDA). While successful in practice, the theoretical analysis of ESGDA remains challenging, with no clear guidance on the choice of inner loop size or on the interplay between inner and outer step sizes. We propose RSGDA (Randomized SGDA), a variant of ESGDA with a stochastic loop size and a simpler theoretical analysis. RSGDA comes with the first (among SGDA algorithms) almost sure convergence rates when used in nonconvex min/strongly-concave max settings. RSGDA can be parameterized using optimal…
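The difference between a fixed and a stochastic inner loop can be illustrated with a minimal Python sketch. This is not the authors' implementation: the abstract as shown here does not say how the loop size is randomized, so the sketch assumes a coin flip at every iteration (descent step with probability `p`, ascent step otherwise), which makes the number of ascent steps between two descent steps random. The function names, the toy objective f(x, y) = x*y - 0.5*y**2, the step sizes, and the noise level are all hypothetical choices made for illustration.

```python
import random


def esgda_sketch(grad_x, grad_y, x, y, m=4, eta_x=0.05, eta_y=0.2,
                 n_outer=1000, seed=0):
    """ESGDA-style loop: m ascent steps on y, then one descent step on x."""
    rng = random.Random(seed)
    for _ in range(n_outer):
        for _ in range(m):  # fixed inner loop size
            y = y + eta_y * grad_y(x, y, rng)
        x = x - eta_x * grad_x(x, y, rng)
    return x, y


def rsgda_sketch(grad_x, grad_y, x, y, p=0.2, eta_x=0.05, eta_y=0.2,
                 n_iters=5000, seed=0):
    """Randomized variant (assumed form): coin flip decides descent vs ascent."""
    rng = random.Random(seed)
    for _ in range(n_iters):
        if rng.random() < p:
            # Descent step on the outer variable x.
            x = x - eta_x * grad_x(x, y, rng)
        else:
            # Ascent step on the inner variable y.
            y = y + eta_y * grad_y(x, y, rng)
    return x, y


# Toy problem (hypothetical): f(x, y) = x*y - 0.5*y**2, strongly concave in y.
# The inner maximizer is y = x, so the outer objective is 0.5*x**2.
def noisy_grad_x(x, y, rng, sigma=0.01):
    return y + rng.gauss(0.0, sigma)        # df/dx plus Gaussian noise


def noisy_grad_y(x, y, rng, sigma=0.01):
    return (x - y) + rng.gauss(0.0, sigma)  # df/dy plus Gaussian noise


x_e, y_e = esgda_sketch(noisy_grad_x, noisy_grad_y, x=1.0, y=0.0)
x_r, y_r = rsgda_sketch(noisy_grad_x, noisy_grad_y, x=1.0, y=0.0)
print("ESGDA sketch:", x_e, y_e)
print("RSGDA sketch:", x_r, y_r)
```

On this toy problem both loops should drive x and y toward the saddle point at the origin, up to gradient noise. In the randomized sketch the single parameter `p` takes the place of the fixed loop size `m`, with an average of (1 - p) / p ascent steps per descent step.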

Taxonomy

Topics: Stochastic Gradient Optimization Techniques · Machine Learning and Algorithms · Markov Chains and Monte Carlo Methods

Methods: Stochastic Gradient Descent