# Some Limit Properties of Markov Chains Induced by Stochastic Recursive Algorithms

**Authors:** Abhishek Gupta, Hao Chen, Jianzong Pi, Gaurav Tendolkar

arXiv: 1904.10778 · 2020-07-27

## TL;DR

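This paper studies Markov chains generated by stochastic recursive algorithms, viewed as iterated random operators. It shows that, as the batch size grows, the chain converges weakly to the deterministic trajectory of the contraction operator it approximates, and that under certain conditions its time average converges to the mean of the invariant distribution. Applications include logistic regression and empirical dynamic programming.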

## Contribution

It establishes weak convergence and ergodic properties of Markov chains induced by iterated random operators over a Polish space, a framework that covers stochastic recursive algorithms such as stochastic gradient descent and empirical dynamic programming.

## Key findings

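- Started from the same initial condition, the random sequence converges weakly to the trajectory of the contraction operator as the batch size grows to infinity.
- Under certain conditions, the time average of the random sequence converges to the spatial mean of the invariant distribution.
- The results are applied to logistic regression, empirical value iteration, and empirical Q value iteration for finite-state, finite-action MDPs.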
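
In symbols, the first two findings can be sketched roughly as follows. The notation is an assumption made for illustration and is not taken from the paper: $T$ is the contraction operator, $\hat T^n_k$ is the random operator applied at step $k$ with batch size $n$, $\delta_{x}$ is the point mass at $x$, and $\pi^n$ is an invariant distribution of the chain with batch size $n$. The second line assumes a state space on which a mean is defined and leaves the mode of convergence unspecified; the precise conditions are in the full paper.

$$
\begin{aligned}
x_{k+1} = T(x_k), \quad X^n_{k+1} = \hat T^n_k(X^n_k), \quad X^n_0 = x_0
&\;\Longrightarrow\; X^n_k \Rightarrow \delta_{x_k} \ \text{as } n \to \infty, \\
\frac{1}{K} \sum_{k=0}^{K-1} X^n_k &\;\longrightarrow\; \int x \, \pi^n(dx) \ \text{as } K \to \infty.
\end{aligned}
$$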

## Abstract

Recursive stochastic algorithms have gained significant attention in the recent past due to data-driven applications. Examples include stochastic gradient descent for solving large-scale optimization problems and empirical dynamic programming algorithms for solving Markov decision problems. These recursive stochastic algorithms approximate certain contraction operators and can be viewed within the framework of iterated random operators. Accordingly, we consider iterated random operators over a Polish space that simulate an iterated contraction operator over that Polish space. Assume that the iterated random operators are indexed by certain batch sizes such that as batch sizes grow to infinity, each realization of the random operator converges (in some sense) to the contraction operator it is simulating. We show that starting from the same initial condition, the distribution of the random sequence generated by the iterated random operators converges weakly to the trajectory generated by the contraction operator. We further show that under certain conditions, the time average of the random sequence converges to the spatial mean of the invariant distribution. We then apply these results to logistic regression, empirical value iteration, and empirical Q value iteration for finite-state, finite-action MDPs to illustrate the general theory developed here.
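
The setup in the abstract can be illustrated with a toy example. The sketch below is not from the paper: the scalar contraction `T(x) = 0.5 * x + E[W]`, the uniform noise `W`, and all function names are hypothetical choices made only to show a batch-averaged random operator tracking a contraction trajectory and having a stable time average.

```python
import random

ALPHA = 0.5  # contraction coefficient (hypothetical choice, not from the paper)


def exact_operator(x):
    """Toy contraction T(x) = ALPHA * x + E[W], with W ~ Uniform(0, 2), so E[W] = 1."""
    return ALPHA * x + 1.0


def random_operator(x, batch_size, rng):
    """Empirical version of T: E[W] is replaced by a sample mean over one batch."""
    sample_mean = sum(rng.uniform(0.0, 2.0) for _ in range(batch_size)) / batch_size
    return ALPHA * x + sample_mean


def random_trajectory(x0, steps, batch_size, rng):
    """Iterate independently drawn random operators starting from x0."""
    xs = [x0]
    for _ in range(steps):
        xs.append(random_operator(xs[-1], batch_size, rng))
    return xs


def exact_trajectory(x0, steps):
    """Iterate the deterministic contraction starting from the same x0."""
    xs = [x0]
    for _ in range(steps):
        xs.append(exact_operator(xs[-1]))
    return xs


def max_gap(batch_size, seed=0, x0=10.0, steps=50):
    """Largest distance between the random and the exact trajectory."""
    rng = random.Random(seed)
    rand = random_trajectory(x0, steps, batch_size, rng)
    det = exact_trajectory(x0, steps)
    return max(abs(a - b) for a, b in zip(rand, det))


def time_average(batch_size, seed=0, x0=10.0, steps=20000, burn_in=100):
    """Average of the random sequence after discarding an initial transient."""
    rng = random.Random(seed)
    xs = random_trajectory(x0, steps, batch_size, rng)
    tail = xs[burn_in:]
    return sum(tail) / len(tail)
```

Calling `max_gap` with a small and a large `batch_size` shows the random trajectory moving closer to the exact one as the batch grows. In this toy model the fixed point of `T` is 2, which is also the mean of the chain's invariant distribution, so `time_average` with a small batch size should land near 2 even though individual iterates keep fluctuating.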

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/1904.10778/full.md

## Figures

9 figures with captions in the complete paper: https://tomesphere.com/paper/1904.10778/full.md

## References

94 references, with the full list in the complete paper: https://tomesphere.com/paper/1904.10778/full.md

---
Source: https://tomesphere.com/paper/1904.10778