A Unifying View of Optimism in Episodic Reinforcement Learning

Gergely Neu; Ciara Pike-Burke

arXiv:2007.01891 · cs.LG · July 7, 2020 · 1 cites

TL;DR

This paper introduces a unified framework for optimism-based algorithms in episodic reinforcement learning, bridging model- and value-optimistic approaches through Lagrangian duality, enabling efficient implementation and analysis.

Contribution

It provides a general, duality-based framework that unifies and simplifies the design, analysis, and implementation of optimistic RL algorithms, including for large-scale problems with function approximation.

Findings

01. Unified framework for model- and value-optimistic algorithms

02. Efficient dynamic programming implementation

03. Applicability to large-scale problems with function approximation

Abstract

The principle of optimism in the face of uncertainty underpins many theoretically successful reinforcement learning algorithms. In this paper we provide a general framework for designing, analyzing and implementing such algorithms in the episodic reinforcement learning problem. This framework is built upon Lagrangian duality, and demonstrates that every model-optimistic algorithm that constructs an optimistic MDP has an equivalent representation as a value-optimistic dynamic programming algorithm. Typically, it was thought that these two classes of algorithms were distinct, with model-optimistic algorithms benefiting from a cleaner probabilistic analysis while value-optimistic algorithms are easier to implement and thus more practical. With the framework developed in this paper, we show that it is possible to get the best of both worlds by providing a class of algorithms which have a…
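
To make the idea of a value-optimistic dynamic programming algorithm concrete, here is a minimal tabular sketch in Python. It is not the paper's construction: the function name, the count-based bonus `bonus_scale * H / sqrt(n)`, and the clipping rule are all assumptions chosen for illustration. It shows only the general pattern of backward induction on an empirical model with an exploration bonus added to the values.

```python
import numpy as np


def optimistic_backward_induction(P_hat, r, counts, H, bonus_scale=1.0):
    """Hypothetical sketch of value-optimistic DP for an episodic tabular MDP.

    P_hat  : (S, A, S) empirical transition probabilities
    r      : (S, A) rewards, assumed to lie in [0, 1]
    counts : (S, A) visit counts used to size the exploration bonus
    H      : episode length
    """
    S, A, _ = P_hat.shape
    V = np.zeros((H + 1, S))  # V[H] = 0 is the terminal value
    Q = np.zeros((H, S, A))
    # Assumed bonus shape: shrinks as a state-action pair is visited more often.
    bonus = bonus_scale * H / np.sqrt(np.maximum(counts, 1))
    for h in range(H - 1, -1, -1):
        # Bellman backup on the empirical model, inflated by the bonus.
        Q[h] = r + P_hat @ V[h + 1] + bonus
        # No value can exceed the number of remaining steps when r is in [0, 1].
        Q[h] = np.minimum(Q[h], H - h)
        V[h] = Q[h].max(axis=1)
    return Q, V[:H]
```

Acting greedily with respect to `Q[h]` in each stage `h` gives the optimistic policy for the next episode. With `bonus_scale=0` the routine reduces to ordinary finite-horizon value iteration on the empirical model.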

Taxonomy

Topics: Supply Chain and Inventory Management · Energy Efficiency and Management · Reinforcement Learning in Robotics