Offline RL via Feature-Occupancy Gradient Ascent

Gergely Neu; Nneka Okolo

arXiv:2405.13755·cs.LG·May 24, 2024

TL;DR

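This paper introduces an offline reinforcement learning algorithm that performs gradient ascent in the space of feature occupancies. For large infinite-horizon discounted MDPs with linearly realizable reward and transition models, its sample complexity scales optimally with the desired accuracy, and it relies only on the least restrictive data coverage assumptions known in the literature.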

Contribution

The paper develops a novel gradient ascent algorithm in feature occupancy space with strong theoretical guarantees and minimal data coverage requirements, advancing offline RL in linear MDPs.

Findings

01

Sample complexity scales optimally with the desired accuracy level

02

Requires only the least restrictive data coverage assumptions known in the literature

03

Easy to implement without prior coverage knowledge

Abstract

We study offline Reinforcement Learning in large infinite-horizon discounted Markov Decision Processes (MDPs) when the reward and transition models are linearly realizable under a known feature map. Starting from the classic linear-program formulation of the optimal control problem in MDPs, we develop a new algorithm that performs a form of gradient ascent in the space of feature occupancies, defined as the expected feature vectors that can potentially be generated by executing policies in the environment. We show that the resulting simple algorithm satisfies strong computational and sample complexity guarantees, achieved under the least restrictive data coverage assumptions known in the literature. In particular, we show that the sample complexity of our method scales optimally with the desired accuracy level and depends on a weak notion of coverage that only requires the empirical…
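To make the notion of a feature occupancy concrete, the following is a minimal sketch on a toy two-state MDP. It is not the paper's algorithm: it only computes the normalized discounted occupancy measure of a fixed policy and the expected feature vector it induces. The transition tensor, policy, one-hot feature map, and reward parameter `theta` are all made-up values chosen for illustration, and the function names are hypothetical.

```python
import numpy as np

def occupancy_measure(P, pi, nu0, gamma):
    """Normalized discounted state-action occupancy of a policy.

    P:   (S, A, S) transitions, P[s, a, t] = Pr(next state t | s, a)
    pi:  (S, A) policy, pi[s, a] = Pr(a | s)
    nu0: (S,) initial-state distribution
    """
    S = P.shape[0]
    # State-to-state kernel under pi: P_pi[s, t] = sum_a pi[s, a] * P[s, a, t]
    P_pi = np.einsum("sa,sat->st", pi, P)
    # State occupancy solves nu = (1 - gamma) * nu0 + gamma * P_pi^T nu
    nu = np.linalg.solve(np.eye(S) - gamma * P_pi.T, (1 - gamma) * nu0)
    return nu[:, None] * pi  # mu[s, a] = nu[s] * pi[s, a]

def feature_occupancy(mu, Phi):
    """Expected feature vector under mu; Phi has shape (S, A, d)."""
    return np.einsum("sa,sad->d", mu, Phi)

# Toy MDP (illustrative numbers only): 2 states, 2 actions.
P = np.array([[[0.9, 0.1], [0.2, 0.8]],
              [[0.5, 0.5], [0.0, 1.0]]])
pi = np.array([[0.5, 0.5], [0.1, 0.9]])
nu0 = np.array([1.0, 0.0])
gamma = 0.9

# One-hot features recover the tabular case; d = S * A = 4.
Phi = np.eye(4).reshape(2, 2, 4)
theta = np.array([0.0, 1.0, 0.5, 0.2])  # hypothetical reward parameter

mu = occupancy_measure(P, pi, nu0, gamma)
lam = feature_occupancy(mu, Phi)

# With linearly realizable rewards r(s, a) = <Phi[s, a], theta>, the
# normalized return depends on the policy only through lam.
r = Phi @ theta
return_via_mu = np.sum(mu * r)
return_via_lam = lam @ theta
```

The occupancy `mu` satisfies the flow constraints of the classic linear program, and under linear reward realizability the objective is a linear function of `lam` alone, which is what makes optimizing over feature occupancies a natural choice when the state space is large.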

Taxonomy

Topics: Machine Learning and Data Classification