Sample and Oracle Efficient Reinforcement Learning for MDPs with Linearly-Realizable Value Functions

Zakaria Mhammedi

arXiv:2409.04840·cs.LG·October 4, 2024

TL;DR

This paper introduces a sample- and oracle-efficient reinforcement learning algorithm for MDPs with large or infinite state and action spaces in which the state-action value function of any policy is linear in a given feature map, a setting that previously lacked a computationally efficient algorithm under online access.

Contribution

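The paper presents a new RL algorithm that finds a near-optimal policy in MDPs where the state-action value function of any policy is linear in a given feature map. The number of episodes and the number of calls to a cost-sensitive classification (CSC) oracle are both polynomial in the problem parameters, and the CSC oracle can be implemented efficiently when the feature dimension is constant.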
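
The linear-realizability condition can be written out as follows. This is a sketch in standard notation rather than a quotation from the paper: the symbols $\phi$, $\theta_h^{\pi}$, $d$, and $H$ are assumptions introduced here for illustration.

```latex
% Assumed notation (not taken from the source page):
%   \phi : X x A -> R^d   known feature map
%   H                     horizon (number of layers)
%   Q_h^\pi               state-action value function of policy \pi at layer h
\[
  \forall \pi,\ \forall h \in \{1,\dots,H\},\ \exists\, \theta_h^{\pi} \in \mathbb{R}^d
  \ \text{such that}\quad
  Q_h^{\pi}(x,a) = \phi(x,a)^{\top} \theta_h^{\pi}
  \quad \text{for all } (x,a).
\]
```

The condition constrains only the value functions; the transition dynamics need not be linear in $\phi$, which is consistent with the abstract's statement that the setting strictly generalizes linear MDPs.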

Findings

01

The number of episodes and CSC oracle calls is polynomial in the problem parameters.

02

The CSC oracle can be implemented efficiently when the feature dimension is constant.

03

Addresses a setting that previously lacked a computationally efficient algorithm under online access to the MDP.

Abstract

Designing sample-efficient and computationally feasible reinforcement learning (RL) algorithms is particularly challenging in environments with large or infinite state and action spaces. In this paper, we advance this effort by presenting an efficient algorithm for Markov Decision Processes (MDPs) where the state-action value function of any policy is linear in a given feature map. This challenging setting can model environments with infinite states and actions, strictly generalizes classic linear MDPs, and currently lacks a computationally efficient algorithm under online access to the MDP. Specifically, we introduce a new RL algorithm that efficiently finds a near-optimal policy in this setting, using a number of episodes and calls to a cost-sensitive classification (CSC) oracle that are both polynomial in the problem parameters. Notably, our CSC oracle can be efficiently implemented…
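
To make the notion of a cost-sensitive classification (CSC) oracle concrete, here is a minimal sketch of the generic CSC problem: given contexts and per-action cost vectors, return the policy in a class with the smallest total cost. Everything in it is an assumption for illustration, not the paper's oracle: the function names `greedy_action` and `csc_oracle` are hypothetical, the policy class (greedy with respect to a linear score over a finite list of candidate parameter vectors) is chosen for simplicity, and brute-force enumeration stands in for whatever efficient implementation the paper describes.

```python
from typing import Sequence, Tuple

Vector = Sequence[float]


def dot(u: Vector, v: Vector) -> float:
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def greedy_action(action_features: Sequence[Vector], theta: Vector) -> int:
    """Index of the action whose feature vector scores highest under theta.

    Ties are broken in favor of the lowest action index.
    """
    scores = [dot(phi, theta) for phi in action_features]
    return max(range(len(scores)), key=scores.__getitem__)


def csc_oracle(
    contexts: Sequence[Sequence[Vector]],
    costs: Sequence[Sequence[float]],
    thetas: Sequence[Vector],
) -> Tuple[int, float]:
    """Brute-force CSC over a finite class of greedy linear policies.

    contexts[i] holds one feature vector per action for example i,
    costs[i][a] is the cost of taking action a on example i.
    Returns the index of the best candidate in `thetas` and its total cost.
    """
    best_index, best_cost = -1, float("inf")
    for i, theta in enumerate(thetas):
        total = sum(
            cost[greedy_action(x, theta)] for x, cost in zip(contexts, costs)
        )
        if total < best_cost:
            best_index, best_cost = i, total
    return best_index, best_cost


# Hypothetical toy data: two examples, two actions, two candidate policies.
features = [[1.0, 0.0], [0.0, 1.0]]
example_contexts = [features, features]
example_costs = [[0.0, 1.0], [0.2, 0.9]]
candidates = [[1.0, 0.0], [0.0, 1.0]]
best, total_cost = csc_oracle(example_contexts, example_costs, candidates)
print(best, total_cost)
```

Enumerating candidates takes time linear in the size of the policy class, so this version is only useful as a specification of what the oracle returns. The abstract's claim concerns the number of oracle calls, not the cost of a brute-force search like this one.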

Taxonomy

Topics: Auction Theory and Applications