Gaussian Approximation and Multiplier Bootstrap for Polyak-Ruppert Averaged Linear Stochastic Approximation with Applications to TD Learning

Sergey Samsonov; Eric Moulines; Qi-Man Shao; Zhuo-Song Zhang; Alexey Naumov

arXiv:2405.16644 · stat.ML · February 4, 2025

TL;DR

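This paper derives a Berry-Esseen bound for the Gaussian approximation of Polyak-Ruppert averaged linear stochastic approximation (LSA) with decreasing step size, and proves that multiplier bootstrap confidence intervals for the LSA parameter are valid non-asymptotically. The results are illustrated on temporal difference (TD) learning with linear function approximation.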

Contribution

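It establishes a Berry-Esseen bound for the multivariate normal approximation of the Polyak-Ruppert averaged LSA iterates and proves the non-asymptotic validity of confidence intervals built with an online multiplier bootstrap, which updates the LSA estimate together with a set of randomly perturbed LSA estimates as observations arrive.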

Findings

01. Berry-Esseen bound for multivariate normal approximation

02. Non-asymptotic validity of bootstrap confidence intervals

03. Application to temporal difference learning

Abstract

In this paper, we obtain the Berry-Esseen bound for multivariate normal approximation for the Polyak-Ruppert averaged iterates of the linear stochastic approximation (LSA) algorithm with decreasing step size. Moreover, we prove the non-asymptotic validity of the confidence intervals for parameter estimation with LSA based on multiplier bootstrap. This procedure updates the LSA estimate together with a set of randomly perturbed LSA estimates upon the arrival of subsequent observations. We illustrate our findings in the setting of temporal difference learning with linear function approximation.
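The procedure described in the abstract can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names `lsa_multiplier_bootstrap` and `bootstrap_interval`, the step size schedule `c0 / k**gamma`, the exponential multiplier weights (mean 1, variance 1), the averaging convention, and the toy problem are all hypothetical choices made here for illustration.

```python
import numpy as np


def lsa_multiplier_bootstrap(sample, theta0, n, n_boot=200, c0=0.5, gamma=0.75, seed=0):
    """Run averaged LSA and n_boot randomly perturbed copies in one online pass.

    sample(rng) must return one noisy observation (A_k, b_k) of the linear
    system A theta = b. All names and defaults here are illustrative assumptions.
    """
    rng = np.random.default_rng(seed)
    theta = np.asarray(theta0, dtype=float).copy()
    d = theta.shape[0]
    theta_b = np.tile(theta, (n_boot, 1))      # one perturbed trajectory per row
    avg = np.zeros(d)                          # Polyak-Ruppert average of theta
    avg_b = np.zeros((n_boot, d))              # averages of perturbed trajectories

    for k in range(1, n + 1):
        A_k, b_k = sample(rng)
        alpha = c0 / k**gamma                  # decreasing step size (assumed form)

        # Plain LSA step.
        theta = theta - alpha * (A_k @ theta - b_k)

        # Perturbed steps: same observation, step rescaled by a random weight
        # with mean 1 and variance 1 (exponential weights are an assumption).
        w = rng.exponential(1.0, size=n_boot)
        resid_b = theta_b @ A_k.T - b_k        # row i equals A_k theta_b[i] - b_k
        theta_b = theta_b - alpha * w[:, None] * resid_b

        # Running averages over iterates 1..k (averaging convention assumed).
        avg += (theta - avg) / k
        avg_b += (theta_b - avg_b) / k

    return avg, avg_b


def bootstrap_interval(avg, avg_b, coord, level=0.95):
    """Symmetric interval for one coordinate from the bootstrap spread."""
    q = np.quantile(np.abs(avg_b[:, coord] - avg[coord]), level)
    return avg[coord] - q, avg[coord] + q


if __name__ == "__main__":
    theta_star = np.array([1.0, -2.0])         # toy target, E[A] = I and E[b] = theta_star

    def toy_sample(rng):
        A = np.eye(2) + 0.1 * rng.standard_normal((2, 2))
        b = theta_star + 0.5 * rng.standard_normal(2)
        return A, b

    avg, avg_b = lsa_multiplier_bootstrap(toy_sample, np.zeros(2), n=4000)
    print("estimate:", avg)
    print("interval for coordinate 0:", bootstrap_interval(avg, avg_b, 0))
```

The key design point is that every perturbed trajectory reuses the same observation `(A_k, b_k)` and differs only in its random weight, so the whole procedure needs a single pass over the data and no stored history. The spread of the perturbed averages around `avg` then serves as a proxy for the estimation error of `avg`. For TD learning with linear function approximation, `sample` would return the quantities built from feature vectors and rewards of one observed transition; that mapping is not shown here.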

Taxonomy

Topics: Neural Networks and Applications · Target Tracking and Data Fusion in Sensor Networks · Distributed Sensor Networks and Detection Algorithms

Methods: Sparse Evolutionary Training