Information-Theoretic Considerations in Batch Reinforcement Learning

Jinglin Chen; Nan Jiang

arXiv:1905.00360·cs.LG·May 2, 2019·49 cites

TL;DR

This paper explores the theoretical foundations of batch reinforcement learning, analyzing the assumptions needed for value-function approximation guarantees and clarifying their necessity and naturalness.

Contribution

The paper provides new theoretical insights into the assumptions underlying finite-sample guarantees in batch RL, addressing why they are needed and when they hold.

Findings

01 Clarifies the necessity of distribution shift assumptions

02 Analyzes the naturalness of representation conditions

03 Advances understanding of value-function approximation in RL

Abstract

Value-function approximation methods that operate in batch mode have foundational importance to reinforcement learning (RL). Finite sample guarantees for these methods often crucially rely on two types of assumptions: (1) mild distribution shift, and (2) representation conditions that are stronger than realizability. However, the necessity ("why do we need them?") and the naturalness ("when do they hold?") of such assumptions have largely eluded the literature. In this paper, we revisit these assumptions and provide theoretical results towards answering the above questions, and make steps towards a deeper understanding of value-function approximation.

Taxonomy

Topics: Reinforcement Learning in Robotics · Optimization and Search Problems · Scheduling and Optimization Algorithms