Finite Sample Analysis of LSTD with Random Projections and Eligibility Traces
Haifang Li, Yingce Xia, Wensheng Zhang

TL;DR
This paper introduces LSTD(λ)-RP, a new algorithm for policy evaluation in high-dimensional reinforcement learning, combining random projections and eligibility traces, with theoretical error bounds and improved performance over prior methods.
Contribution
The paper presents a novel LSTD(λ)-RP algorithm that integrates random projections and eligibility traces, providing theoretical analysis and demonstrating improved accuracy.
Findings
Provides upper bounds on estimation, approximation, and total errors.
Shows LSTD(λ)-RP outperforms previous LSTD-RP and LSTD(λ) algorithms.
Theoretically validates benefits of random projections and eligibility traces.
Abstract
Policy evaluation with linear function approximation is an important problem in reinforcement learning. In high-dimensional feature spaces, the problem becomes extremely hard in terms of both computational efficiency and approximation quality. We propose a new algorithm, LSTD(λ)-RP, which leverages random projection techniques and takes eligibility traces into consideration to tackle these two challenges. We carry out a theoretical analysis of LSTD(λ)-RP and provide meaningful upper bounds on the estimation error, approximation error, and total generalization error. These results demonstrate that LSTD(λ)-RP can benefit from both random projections and eligibility traces, and that it can achieve better performance than the prior LSTD-RP and LSTD(λ) algorithms.
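To make the combination described in the abstract concrete, here is a minimal sketch of how random projections and eligibility traces might be put together in an LSTD-style estimator. It is an illustration under stated assumptions, not the authors' implementation: the function name `lstd_lambda_rp`, the parameter names (`d`, `gamma`, `lam`, `seed`), the Gaussian projection with N(0, 1/d) entries, and the least-squares solve are all choices made for this example.

```python
import numpy as np

def lstd_lambda_rp(features, rewards, d, gamma=0.9, lam=0.5, seed=0):
    """Sketch of LSTD(lambda) run on randomly projected features.

    features: (n + 1, D) array of high-dimensional features for x_0..x_n
    rewards:  (n,) array, rewards[t] observed on the transition x_t -> x_{t+1}
    d:        target (low) dimension of the random projection
    """
    rng = np.random.default_rng(seed)
    n = features.shape[0] - 1
    D = features.shape[1]

    # Assumed projection: i.i.d. Gaussian entries with variance 1/d.
    proj = rng.normal(0.0, 1.0 / np.sqrt(d), size=(d, D))
    low = features @ proj.T  # projected features, shape (n + 1, d)

    z = np.zeros(d)          # eligibility trace
    A = np.zeros((d, d))
    b = np.zeros(d)
    for t in range(n):
        z = gamma * lam * z + low[t]
        A += np.outer(z, low[t] - gamma * low[t + 1])
        b += z * rewards[t]

    # Least-squares solve of A theta = b (tolerates an ill-conditioned A).
    theta = np.linalg.lstsq(A, b, rcond=None)[0]
    return proj, theta
```

Under these assumptions, the value estimate for a state with feature vector `phi` would be `(proj @ phi) @ theta`. Setting `lam=0` drops the traces and leaves a plain LSTD solve on projected features, while taking `proj` to be the identity leaves ordinary LSTD(λ).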
Taxonomy
Topics: Reinforcement Learning in Robotics · Evolutionary Algorithms and Applications · Robot Manipulation and Learning
