Pseudorehearsal in value function approximation

Vladimir Marochko; Leonard Johard; Manuel Mazzara

arXiv:1703.07075 · cs.AI · March 22, 2017 · 1 citation

Open Access

TL;DR

This paper investigates pseudorehearsal techniques for mitigating catastrophic forgetting in reinforcement learning. In a pole balancing task, pseudorehearsal appears to assist learning, provided the rehearsal parameters are properly initialized.

Contribution

It compares several pseudorehearsal approaches for Q-learning with function approximation on a pole balancing task, and reports that they seem to help even in such a simple problem.

Findings

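01 Pseudorehearsal appears to assist learning when the data distribution is non-stationary over time, as it generally is in reinforcement learning.

02 Proper initialization of the rehearsal parameters is required for the benefit to appear.

03 Pseudorehearsal seems to help even in a problem as simple as pole balancing.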

Abstract

Catastrophic forgetting is of special importance in reinforcement learning, as the data distribution is generally non-stationary over time. We study and compare several pseudorehearsal approaches for Q-learning with function approximation in a pole balancing task. We have found that pseudorehearsal seems to assist learning even in such very simple problems, given proper initialization of the rehearsal parameters.
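The idea of pseudorehearsal combined with Q-learning can be sketched roughly as follows. This is a minimal illustration and not the authors' implementation: the linear approximator `LinearQ`, the helper names `make_pseudo_items` and `q_step_with_pseudorehearsal`, the uniform sampling range for pseudo-states, and all hyperparameter values are assumptions made for this example. Pseudo-items are random inputs paired with the approximator's own current outputs; rehearsing them alongside each new transition pulls the approximator back toward what it previously computed.

```python
import random


class LinearQ:
    """Hypothetical stand-in approximator: Q(s, a) = w[a] . s + b[a]."""

    def __init__(self, n_features, n_actions, rng):
        self.w = [[rng.uniform(-0.1, 0.1) for _ in range(n_features)]
                  for _ in range(n_actions)]
        self.b = [0.0] * n_actions

    def predict(self, state):
        """Return the list of Q-values, one per action."""
        return [sum(wi * si for wi, si in zip(w, state)) + b
                for w, b in zip(self.w, self.b)]

    def update(self, state, action, target, lr):
        """One SGD step on the squared error of a single action's output."""
        error = target - self.predict(state)[action]
        for i, s in enumerate(state):
            self.w[action][i] += lr * error * s
        self.b[action] += lr * error


def make_pseudo_items(q, n_items, n_features, rng):
    """Sample random pseudo-states and record the current Q outputs as targets.

    The sampling range [-1, 1] is an assumption for this sketch.
    """
    items = []
    for _ in range(n_items):
        state = [rng.uniform(-1.0, 1.0) for _ in range(n_features)]
        items.append((state, q.predict(state)))
    return items


def q_step_with_pseudorehearsal(q, transition, pseudo_items, gamma=0.99, lr=0.05):
    """Apply one Q-learning update, then rehearse the stored pseudo-items."""
    state, action, reward, next_state, done = transition
    # Standard Q-learning target for the real transition.
    target = reward if done else reward + gamma * max(q.predict(next_state))
    q.update(state, action, target, lr)
    # Rehearsal: nudge outputs on pseudo-states back toward their recorded values.
    for p_state, p_values in pseudo_items:
        for a, value in enumerate(p_values):
            q.update(p_state, a, value, lr)


# Usage: four state features and two actions, as in a typical pole balancing setup.
rng = random.Random(0)
q = LinearQ(n_features=4, n_actions=2, rng=rng)
pseudo_items = make_pseudo_items(q, n_items=10, n_features=4, rng=rng)
transition = ([0.1, 0.0, -0.05, 0.2], 1, 1.0, [0.1, 0.1, -0.04, 0.1], False)
q_step_with_pseudorehearsal(q, transition, pseudo_items)
```

How many pseudo-items to keep, how they are sampled, and when they are regenerated are exactly the kind of rehearsal parameters whose initialization the abstract says matters; the values above are placeholders.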

Taxonomy

Topics: Reinforcement Learning in Robotics · Evolutionary Algorithms and Applications · Advanced Bandit Algorithms Research

Methods: Q-Learning