When to use parametric models in reinforcement learning?

Hado van Hasselt; Matteo Hessel; John Aslanides

arXiv:1906.05243 · cs.LG · September 18, 2019 · 62 cites


TL;DR

This paper investigates the conditions under which parametric models are most effective in reinforcement learning, comparing them with experience replay methods and validating findings on Atari games.

Contribution

It provides a theoretical and empirical analysis of when replay-based methods outperform model-based approaches in reinforcement learning.

Findings

01

Replay-based algorithms can be more data-efficient than model-based ones under certain conditions.

02

The hypothesis was validated on Atari 2600 games, where the replay-based algorithm attained state-of-the-art data efficiency.

03

Replay is expected to be competitive with or better than a parametric model when the model is used only to generate fictional transitions from observed states for an otherwise model-free update rule.

Abstract

We examine the question of when and how parametric models are most useful in reinforcement learning. In particular, we look at commonalities and differences between parametric models and experience replay. Replay-based learning algorithms share important traits with model-based approaches, including the ability to plan: to use more computation without additional data to improve predictions and behaviour. We discuss when to expect benefits from either approach, and interpret prior work in this context. We hypothesise that, under suitable conditions, replay-based algorithms should be competitive to or better than model-based algorithms if the model is used only to generate fictional transitions from observed states for an update rule that is otherwise model-free. We validated this hypothesis on Atari 2600 video games. The replay-based algorithm attained state-of-the-art data efficiency,…
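The setting in the hypothesis, where a model only generates fictional transitions from observed states for an otherwise model-free update, can be illustrated with a minimal tabular sketch. This is not the authors' algorithm or their Atari setup: the chain environment, the `step`, `q_update`, and `run` functions, and all constants are hypothetical and chosen only for illustration. Planning updates are drawn either straight from a replay buffer or from a learned tabular model queried at previously observed state-action pairs.

```python
import random
from collections import defaultdict

GAMMA = 0.9           # discount factor (assumed value)
ALPHA = 0.5           # step size (assumed value)
N_STATES = 5          # chain of states 0..4; state 4 is terminal
ACTIONS = (0, 1)      # 0 = left, 1 = right


def step(state, action):
    """Hypothetical toy environment: deterministic chain, reward 1 at the end."""
    nxt = max(0, state - 1) if action == 0 else state + 1
    done = nxt == N_STATES - 1
    return nxt, (1.0 if done else 0.0), done


def q_update(q, s, a, r, s2, done):
    """Model-free Q-learning update applied to a single transition."""
    target = r if done else r + GAMMA * max(q[(s2, b)] for b in ACTIONS)
    q[(s, a)] += ALPHA * (target - q[(s, a)])


def run(planner, episodes=20, planning_steps=10, seed=0):
    """Random-policy Q-learning with extra planning updates per real step.

    planner="replay": planning updates reuse stored transitions as they are.
    planner="model":  planning updates take an observed (s, a) pair and ask a
                      learned tabular model for the reward and next state.
    """
    rng = random.Random(seed)
    q = defaultdict(float)
    replay = []   # stored real transitions
    model = {}    # (s, a) -> (r, s2, done), fitted from real data
    for _ in range(episodes):
        s, done = 0, False
        while not done:
            a = rng.choice(ACTIONS)
            s2, r, done = step(s, a)
            q_update(q, s, a, r, s2, done)        # update from real experience
            replay.append((s, a, r, s2, done))
            model[(s, a)] = (r, s2, done)
            for _ in range(planning_steps):       # planning: more compute, no new data
                ps, pa, pr, ps2, pdone = rng.choice(replay)
                if planner == "model":
                    # Fictional transition generated from an observed state.
                    pr, ps2, pdone = model[(ps, pa)]
                q_update(q, ps, pa, pr, ps2, pdone)
            s = s2
    return q


if __name__ == "__main__":
    q_replay = run("replay")
    q_model = run("model")
    print("identical value tables:", dict(q_replay) == dict(q_model))
```

Because this toy environment is deterministic and the tabular model is exact on every pair it has seen, the model-generated transitions coincide with the replayed ones, so both planners apply the same updates. A parametric model with approximation error would not reproduce the stored transitions exactly, which is one way to read why replay may be competitive or better in this restricted use of a model.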


Taxonomy

Topics: Reinforcement Learning in Robotics · Advanced Control Systems Optimization · Evolutionary Algorithms and Applications