Some Simulation Results for Emphatic Temporal-Difference Learning Algorithms

Huizhen Yu

arXiv:1605.02099·cs.LG·May 10, 2016·2 cites

TL;DR

This paper provides simulation results demonstrating the behavior of constrained emphatic temporal-difference (ETD) algorithms across three example problems, complementing previous theoretical convergence analyses.

Contribution

It supplements the author's earlier theoretical study of the weak convergence properties of constrained ETD algorithms with simulation results on three example problems.

Findings

01 ETD algorithms exhibit stable behavior in simulated environments

02 Simulation results align with theoretical convergence properties

03 Behavior varies across different example problems

Abstract

This is a companion note to our recent study of the weak convergence properties of constrained emphatic temporal-difference learning (ETD) algorithms from a theoretic perspective. It supplements the latter analysis with simulation results and illustrates the behavior of some of the ETD algorithms using three example problems.

Taxonomy

Topics: Advanced Adaptive Filtering Techniques · Machine Learning and ELM · Neural Networks and Applications