Stochastic Gradient Descent under Markovian Sampling Schemes

Mathieu Even

arXiv:2302.14428·math.OC·June 26, 2023·1 cites

TL;DR

This paper analyzes the convergence of stochastic gradient descent algorithms that operate under Markovian sampling schemes, providing theoretical bounds and introducing a variance-reduced variant for improved efficiency.

Contribution

The paper establishes a lower bound for methods that sample stochastic gradients along a Markov chain, analyzes MC-SGD under mild regularity assumptions, and proposes MC-SAG, a variance-reduced, communication-efficient alternative.

Findings

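01. Derived a theoretical lower bound that depends on the hitting time of the underlying Markov chain.

02. Proved convergence of MC-SGD under mild regularity assumptions (no bounded gradients or domain; infinite state spaces allowed).

03. Introduced MC-SAG, a variance-reduced, communication-efficient alternative to MC-SGD.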
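The hitting time mentioned in the first finding can be made concrete with a small numerical sketch. The code below is not taken from the paper: it assumes a finite chain with transition matrix `P`, and the helper names `expected_hitting_times` and `ring_walk` are hypothetical. It uses the standard linear system h(i) = 1 + Σ_k P[i, k] h(k) for i ≠ target, with h(target) = 0.

```python
import numpy as np


def expected_hitting_times(P, target):
    """Expected number of steps to first reach `target` from each state.

    Solves (I - Q) h = 1, where Q is P restricted to the non-target states.
    Assumes `target` is reachable from every state.
    """
    n = P.shape[0]
    others = [i for i in range(n) if i != target]
    Q = P[np.ix_(others, others)]
    h_others = np.linalg.solve(np.eye(n - 1) - Q, np.ones(n - 1))
    h = np.zeros(n)
    h[others] = h_others
    return h


def ring_walk(n):
    """Simple random walk on a ring of n nodes (illustrative assumption)."""
    P = np.zeros((n, n))
    for i in range(n):
        P[i, (i - 1) % n] = 0.5
        P[i, (i + 1) % n] = 0.5
    return P


# Hitting times of node 0 for a walker on a ring of 6 nodes.
h = expected_hitting_times(ring_walk(6), target=0)
print(h)
```

On a ring, the walker needs k(n − k) steps in expectation to reach a node at distance k, so poorly connected graphs yield large hitting times.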

Abstract

We study a variation of vanilla stochastic gradient descent where the optimizer only has access to a Markovian sampling scheme. These schemes encompass applications that range from decentralized optimization with a random walker (token algorithms), to RL and online system identification problems. We focus on obtaining rates of convergence under the least restrictive assumptions possible on the underlying Markov chain and on the functions optimized. We first establish a theoretical lower bound for methods that sample stochastic gradients along the path of a Markov chain, revealing a dependency on the hitting time of the underlying Markov chain. We then study Markov chain SGD (MC-SGD) under much milder regularity assumptions than prior works (e.g., no bounded gradients or domain, and infinite state spaces). We finally introduce MC-SAG, an alternative to MC-SGD with variance reduction,…
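As a rough illustration of the sampling scheme the abstract describes, the sketch below runs SGD where the index of the sampled gradient follows a Markov chain instead of being drawn independently. It is a minimal sketch under assumptions that do not come from the paper: a random walker on a ring of 5 nodes, scalar quadratic local functions f_v(x) = (x − b_v)² / 2, a constant step size, and hypothetical names `mc_sgd` and `ring_walk`.

```python
import numpy as np


def ring_walk(n):
    """Simple random walk on a ring of n nodes (illustrative assumption)."""
    P = np.zeros((n, n))
    for i in range(n):
        P[i, (i - 1) % n] = 0.5
        P[i, (i + 1) % n] = 0.5
    return P


def mc_sgd(grads, P, x0, step_size, n_steps, seed=0):
    """SGD where the sampled function index follows a Markov chain.

    grads[v](x) returns the gradient of the local function f_v at x.
    P is the row-stochastic transition matrix of the chain.
    """
    rng = np.random.default_rng(seed)
    n = P.shape[0]
    state = rng.integers(n)  # arbitrary initial state of the walker
    x = float(x0)
    for _ in range(n_steps):
        x = x - step_size * grads[state](x)   # gradient step at current state
        state = rng.choice(n, p=P[state])     # walker moves to a neighbor
    return x


# Local quadratics f_v(x) = 0.5 * (x - b_v)^2; their average is minimized at mean(b).
b = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
grads = [lambda x, bv=bv: x - bv for bv in b]
x_final = mc_sgd(grads, ring_walk(5), x0=10.0, step_size=0.01, n_steps=20000)
print(x_final, b.mean())
```

With a constant step size the iterate does not converge exactly; it fluctuates around the minimizer of the average objective, here `b.mean()`. Consecutive samples are correlated, which is what distinguishes this setting from i.i.d. sampling.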

Videos

Stochastic Gradient Descent under Markovian Sampling Schemes · SlidesLive

Taxonomy

Topics: Stochastic Gradient Optimization Techniques · Markov Chains and Monte Carlo Methods · Privacy-Preserving Technologies in Data

Methods: Stochastic Gradient Descent