Between Rate-Distortion Theory & Value Equivalence in Model-Based Reinforcement Learning

Dilip Arumugam; Benjamin Van Roy

arXiv:2206.02025·cs.LG·June 7, 2022


Open Access

TL;DR

This paper applies rate-distortion theory to relax value equivalence in model-based reinforcement learning, so that a capacity-constrained agent can still make near-optimal decisions using a simplified model of its environment.

Contribution

It introduces an algorithm that synthesizes approximate environment models using rate-distortion principles, addressing the challenge of intractable environment complexity and limited agent capacity.

Findings

01. Proposes a formal framework for approximate value equivalence.

02. Demonstrates how rate-distortion theory guides environment model compression.

03. Shows potential for near-optimal policies with simplified models.
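
The rate-distortion trade-off behind these findings can be illustrated with a small numerical sketch. The code below is not the paper's algorithm: it is the classical Blahut-Arimoto iteration for a generic finite source, and the function name `blahut_arimoto`, the toy binary source, and the Hamming distortion matrix are all assumptions chosen for illustration. In the paper's setting the source would presumably be the unknown environment and the distortion would measure how far a surrogate model is from value equivalence; here both are replaced by a toy example.

```python
import numpy as np


def blahut_arimoto(p_x, distortion, beta, n_iters=200):
    """Classical Blahut-Arimoto iteration (illustrative, not the paper's method).

    p_x:        (n,) source distribution over n symbols.
    distortion: (n, m) matrix with entries d(x, x_hat).
    beta:       non-negative multiplier trading rate against distortion.
    Returns (rate in nats, expected distortion).
    """
    n, m = distortion.shape
    q = np.full(m, 1.0 / m)  # marginal over reconstructions, starts uniform
    for _ in range(n_iters):
        # Channel update: p(x_hat | x) proportional to q(x_hat) * exp(-beta * d)
        w = q[None, :] * np.exp(-beta * distortion)
        cond = w / w.sum(axis=1, keepdims=True)
        # Marginal update: q(x_hat) = sum_x p(x) p(x_hat | x)
        q = p_x @ cond
    joint = p_x[:, None] * cond
    q_b = np.broadcast_to(q, joint.shape)
    mask = joint > 0
    # Rate is the mutual information I(X; X_hat) under the final channel
    rate = float(np.sum(joint[mask] * (np.log(cond[mask]) - np.log(q_b[mask]))))
    expected_distortion = float(np.sum(joint * distortion))
    return rate, expected_distortion


# Toy example: fair binary source with Hamming distortion.
p_x = np.array([0.5, 0.5])
hamming = np.array([[0.0, 1.0], [1.0, 0.0]])
for beta in (0.0, 1.0, 2.0, 4.0):
    rate, dist = blahut_arimoto(p_x, hamming, beta)
    print(f"beta={beta:.1f}  rate={rate:.4f} nats  distortion={dist:.4f}")
```

Sweeping `beta` traces out the trade-off: a larger `beta` buys lower distortion at the cost of more bits retained about the source, which mirrors the idea of spending limited agent capacity only on the aspects of the environment that matter.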

Abstract

The quintessential model-based reinforcement-learning agent iteratively refines its estimates or prior beliefs about the true underlying model of the environment. Recent empirical successes in model-based reinforcement learning with function approximation, however, eschew the true model in favor of a surrogate that, while ignoring various facets of the environment, still facilitates effective planning over behaviors. Recently formalized as the value equivalence principle, this algorithmic technique is perhaps unavoidable as real-world reinforcement learning demands consideration of a simple, computationally-bounded agent interacting with an overwhelmingly complex environment. In this work, we entertain an extreme scenario wherein some combination of immense environment complexity and limited agent capacity entirely precludes identifying an exactly value-equivalent model. In light of…
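
For orientation, the two ingredients named in the abstract can be written in their standard textbook forms. The notation below is an assumption and is not taken from this page; the paper's own formulation may differ in detail. Here $\mathcal{T}^{M}_{\pi}$ denotes the Bellman operator for policy $\pi$ under model $M$, $\Pi$ is a set of policies, $\mathcal{V}$ is a set of value functions, and $d$ is a distortion function.

```latex
% Value equivalence: a surrogate model \widetilde{M} is value equivalent to M
% with respect to \Pi and \mathcal{V} when every Bellman backup agrees.
\mathcal{T}^{\widetilde{M}}_{\pi} v = \mathcal{T}^{M}_{\pi} v
\qquad \text{for all } \pi \in \Pi,\ v \in \mathcal{V}.

% Rate-distortion function: the fewest nats of information about a source X
% that a reconstruction \hat{X} must retain to keep expected distortion below D.
R(D) = \inf_{p(\hat{x} \mid x)\,:\ \mathbb{E}[d(X, \hat{X})] \le D} I(X; \hat{X}).
```

Read together, an approximate notion of value equivalence arises when the equality in the first line is relaxed to a tolerance, which plays the role of the distortion threshold $D$ in the second.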


Taxonomy

Topics: Reinforcement Learning in Robotics