Temporal-Differential Learning in Continuous Environments

Tao Bian; Zhong-Ping Jiang

arXiv:2006.00997 · cs.LG · June 2, 2020 · 1 citation

Open Access

TL;DR

This paper introduces temporal-differential learning, a reinforcement learning approach designed for continuous environments. It develops two methods, CT-LSPE and CT-TD, and supports them with theoretical and empirical evidence.

Contribution

The paper presents the temporal-differential learning framework and uses it to develop two methods for RL in continuous settings: continuous-time least squares policy evaluation (CT-LSPE) and continuous-time temporal-differential (CT-TD) learning.

Findings

01

The proposed methods outperform traditional approaches in continuous environments.

02

Theoretical analysis confirms convergence properties.

03

Empirical results demonstrate improved policy evaluation accuracy.

Abstract

In this paper, a new reinforcement learning (RL) method, the method of temporal differential, is introduced. In contrast to the traditional temporal-difference learning method, it plays a crucial role in developing novel RL techniques for continuous environments. In particular, the continuous-time least squares policy evaluation (CT-LSPE) and the continuous-time temporal-differential (CT-TD) learning methods are developed. Both theoretical and empirical evidence is provided to demonstrate the effectiveness of the proposed temporal-differential learning methodology.
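
The abstract does not spell out the algorithms, so the following is only an illustrative sketch of the general idea of evaluating a policy from the time derivative of the value function rather than from a discrete-time difference. It is not the paper's CT-LSPE or CT-TD method. Everything in it is an assumption made for illustration: the scalar closed-loop system dx/dt = a x, the running cost q x^2, the single feature x^2, and the hypothetical helper names `simulate` and `fit_value_weight`. For this system the value function is V(x) = w x^2 with w = -q / (2a), and along a trajectory it satisfies dV/dt = -q x^2, which the sketch fits by least squares.

```python
import math

def simulate(a, x0, dt, steps):
    """Sample x(t) = x0 * exp(a t), the exact solution of dx/dt = a x."""
    return [x0 * math.exp(a * k * dt) for k in range(steps + 1)]

def fit_value_weight(xs, dt, q):
    """Least-squares fit of w in V(x) = w * x^2 from dV/dt = -q * x^2."""
    num = 0.0
    den = 0.0
    for k in range(1, len(xs) - 1):
        # Central finite difference approximates d/dt of the feature x^2.
        dphi = (xs[k + 1] ** 2 - xs[k - 1] ** 2) / (2 * dt)
        cost = q * xs[k] ** 2
        # Normal equations for the scalar problem w * dphi = -cost.
        num += dphi * (-cost)
        den += dphi * dphi
    return num / den

# Assumed example values: a stable system (a < 0) and a positive cost weight.
a, q = -1.0, 2.0
xs = simulate(a, x0=1.0, dt=0.001, steps=2000)
w = fit_value_weight(xs, dt=0.001, q=q)
w_exact = -q / (2 * a)  # closed-form value weight for this toy problem
print(w, w_exact)
```

In this toy setting the fitted weight should agree closely with the closed-form value, with the small gap coming from the finite-difference approximation of the derivative.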

Taxonomy

Topics: Reinforcement Learning in Robotics · Adaptive Dynamic Programming Control · Extremum Seeking Control Systems