Closing the gap between SVRG and TD-SVRG with Gradient Splitting

Arsenii Mustafin; Alex Olshevsky; Ioannis Ch. Paschalidis

arXiv:2211.16237·cs.LG·August 7, 2024

TL;DR

This paper introduces a novel approach that combines TD learning with SVRG using gradient splitting, achieving a convergence rate comparable to SVRG in convex optimization, supported by theoretical analysis and experiments.

Contribution

It presents a new method that simplifies and fuses TD learning with SVRG, attaining a geometric convergence rate matching that of SVRG in convex settings.

Findings

01 Achieves a geometric convergence bound with a predetermined learning rate of 1/8

02 Theoretical convergence bound matches that of SVRG in convex optimization

03 Experimental results support the theoretical claims

Abstract

Temporal difference (TD) learning is a policy evaluation method in reinforcement learning whose performance can be enhanced by variance reduction methods. Recently, multiple works have sought to fuse TD learning with the Stochastic Variance Reduced Gradient (SVRG) method to achieve a geometric rate of convergence. However, the resulting convergence rate is significantly weaker than what is achieved by SVRG in the setting of convex optimization. In this work we utilize a recent interpretation of TD learning as the splitting of the gradient of an appropriately chosen function, thus simplifying the algorithm and fusing TD with SVRG. Our main result is a geometric convergence bound with a predetermined learning rate of $1/8$, which is identical to the convergence bound available for SVRG in the convex setting. Our theoretical findings are supported by a set of experiments.
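To make the idea in the abstract concrete, here is a minimal sketch of an SVRG-style TD(0) update on a fixed dataset of transitions with linear value-function features. Only the step size of 1/8 comes from the abstract. The function name `td_svrg`, the epoch length, the number of epochs, and the choice to average the inner iterates into the next reference point are illustrative assumptions, and this is not the authors' implementation (their code is in the repository listed under Code & Models). The sketch also assumes feature vectors with norm at most 1.

```python
import numpy as np


def td_svrg(phi, phi_next, rewards, gamma, alpha=0.125,
            epoch_len=200, n_epochs=20, seed=0):
    """SVRG-style TD(0) for policy evaluation on a fixed batch (hypothetical helper).

    phi, phi_next: (n, d) feature matrices for states and successor states.
    rewards: (n,) rewards. alpha=1/8 follows the abstract; epoch_len and
    n_epochs are assumed values, not taken from the paper.
    """
    rng = np.random.default_rng(seed)
    n, d = phi.shape
    theta_ref = np.zeros(d)  # reference point, refreshed once per epoch

    def mean_direction(theta):
        # Average TD update direction over the whole dataset.
        delta = rewards + gamma * phi_next @ theta - phi @ theta
        return phi.T @ delta / n

    def sample_direction(theta, t):
        # TD update direction for the single transition t.
        delta = rewards[t] + gamma * phi_next[t] @ theta - phi[t] @ theta
        return delta * phi[t]

    for _ in range(n_epochs):
        g_ref = mean_direction(theta_ref)
        theta = theta_ref.copy()
        running_sum = np.zeros(d)
        for _ in range(epoch_len):
            t = rng.integers(n)
            # Variance-reduced direction: sampled term, corrected by the
            # same sample at the reference point plus the full-batch mean.
            v = sample_direction(theta, t) - sample_direction(theta_ref, t) + g_ref
            theta = theta + alpha * v
            running_sum += theta
        # Assumption: use the average inner iterate as the next reference.
        theta_ref = running_sum / epoch_len
    return theta_ref
```

The correction term vanishes in expectation, so the update follows the mean TD direction on average while its variance shrinks as the iterate and the reference point approach the TD fixed point. That shrinking variance is what allows a constant step size instead of a decaying one.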

Code & Models

Repositories

gaarsmu/svrg_for_td_learning
PyTorch · Official

Taxonomy

Topics: Reinforcement Learning in Robotics