True Online TD-Replan(lambda) Achieving Planning through Replaying

Abdulrahman Altahhan

arXiv:2501.19027·cs.LG·February 3, 2025

TL;DR

This paper introduces True Online TD-Replan(λ), a planning method that extends true online TD(λ) with experience replay and uses the λ parameter to control how densely past experience is replayed. On the problems tested, it outperforms true online TD(λ), Dyna Planning, and TD(λ)-Replan.

Contribution

The paper presents a new planning algorithm that extends true online TD(λ) with experience replay whose density is controlled by λ, and reports better performance than other methods of similar quadratic complexity, such as Dyna Planning and TD(λ)-Replan.
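For orientation, the baseline being extended can be sketched as follows. This is a minimal illustration of standard true online TD(λ) with linear function approximation (the textbook Dutch-trace formulation), not the paper's TD-Replan(λ) method, whose replay mechanism is not described on this page. The function name, the episode format, and the default hyperparameters are assumptions made for the example.

```python
import numpy as np

def true_online_td_lambda(episodes, n_features, alpha=0.1, gamma=0.9, lam=0.8):
    """Baseline true online TD(lambda) policy evaluation with linear features.

    `episodes` is an iterable of episodes (hypothetical format); each episode
    is a list of (x, reward, x_next, done) tuples, where x and x_next are
    feature vectors of length `n_features`.
    """
    w = np.zeros(n_features)              # weight vector, v(s) ~ w @ x(s)
    for episode in episodes:
        z = np.zeros(n_features)          # Dutch eligibility trace
        v_old = 0.0
        for x, r, x_next, done in episode:
            x = np.asarray(x, dtype=float)
            x_next = np.asarray(x_next, dtype=float)
            v = w @ x
            v_next = 0.0 if done else w @ x_next   # terminal value is zero
            delta = r + gamma * v_next - v         # one-step TD error
            # Dutch trace update
            z = gamma * lam * z + (1.0 - alpha * gamma * lam * (z @ x)) * x
            # True online weight update with its correction term
            w = w + alpha * (delta + v - v_old) * z - alpha * (v - v_old) * x
            v_old = v_next
    return w
```

Here λ only sets the depth of the update targets through the trace. In this sketch each transition is processed once and then discarded, which is the limitation that replay-based methods aim to address.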

Findings

01. Outperforms true online TD(λ) on problems that benefit from experience replay

02. Outperforms the Dyna Planning and TD(λ)-Replan algorithms in benchmarks

03. Effective in both simple and complex feature environments

Abstract

In this paper, we develop a new planning method that extends the capabilities of true online TD so that an agent can efficiently replay all or part of its past experience online, in the order in which it occurred, either at every step or sparsely, according to the usual λ parameter. In this new method, which we call True Online TD-Replan(λ), the λ parameter takes on a new role: it specifies the density of the replay process, in addition to its usual role of specifying the depth of the target's updates. We demonstrate that, for problems that benefit from experience replay, our new method outperforms true online TD(λ), although its replay capabilities make it quadratic in complexity. In addition, we demonstrate that our method outperforms other methods of similar quadratic complexity, such as the Dyna Planning and TD(λ)-Replan algorithms. We test our…

Taxonomy

Topics: Formal Methods in Verification · Logic, programming, and type systems · Software Testing and Debugging Techniques