Efficient Model-Free Exploration in Low-Rank MDPs

Zakaria Mhammedi; Adam Block; Dylan J. Foster; Alexander Rakhlin

arXiv:2307.03997·cs.LG·March 1, 2024

Open Access

TL;DR

This paper introduces VoX, a novel, computationally efficient, model-free exploration algorithm for low-rank MDPs that leverages barycentric spanners for effective exploration without restrictive assumptions.

Contribution

It presents the first provably sample-efficient, model-free algorithm for exploration in low-rank MDPs that works with general function approximation and no extra structural assumptions.

Findings

01

VoX achieves sample-efficient exploration in low-rank MDPs.

02

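The algorithm is computationally efficient and does not rely on restrictive statistical assumptions such as latent variable structure, access to model-based function approximation, or reachability.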

03

The analysis introduces new techniques for error-tolerant barycentric spanner computation.
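For readers new to the term, a C-approximate barycentric spanner of a set of vectors in R^d is a subset of d vectors such that every vector in the set can be written as a combination of them with coefficients in [-C, C]. The sketch below is the classical swap-based procedure for a finite set (usually attributed to Awerbuch and Kleinberg), not the error-tolerant variant analyzed in the paper and not VoX itself. The function name `barycentric_spanner`, the finite `points` array, and the default `C=2.0` are assumptions made for illustration.

```python
import numpy as np


def barycentric_spanner(points, C=2.0):
    """Return indices of d rows of `points` forming a C-approximate spanner.

    Illustrative sketch only: assumes `points` is a finite (n, d) array
    whose rows span R^d, and C > 1.
    """
    points = np.asarray(points, dtype=float)
    n, d = points.shape
    basis = np.eye(d)      # columns hold the current candidate basis
    chosen = [-1] * d      # row index in `points` behind each column

    def best_swap(i):
        # |det| of the basis after replacing column i by each candidate row.
        dets = np.empty(n)
        for j in range(n):
            trial = basis.copy()
            trial[:, i] = points[j]
            dets[j] = abs(np.linalg.det(trial))
        j = int(np.argmax(dets))
        return j, dets[j]

    # Phase 1: replace each identity column by the row maximizing |det|.
    for i in range(d):
        j, _ = best_swap(i)
        basis[:, i] = points[j]
        chosen[i] = j

    # Phase 2: keep swapping while a swap grows |det| by more than a factor C.
    improved = True
    while improved:
        improved = False
        current = abs(np.linalg.det(basis))
        for i in range(d):
            j, val = best_swap(i)
            if val > C * current:
                basis[:, i] = points[j]
                chosen[i] = j
                improved = True
                break
    return chosen


# Usage: check that all coefficients lie in [-C, C] (up to rounding).
rng = np.random.default_rng(0)
pts = rng.normal(size=(50, 3))
idx = barycentric_spanner(pts, C=2.0)
coef = np.linalg.solve(pts[idx].T, pts.T)   # column k: coefficients of pts[k]
max_abs_coef = float(np.max(np.abs(coef)))
```

By Cramer's rule, each coefficient equals a ratio of determinants, so once no swap increases the determinant by more than a factor C, every coefficient is bounded by C in absolute value. Each swap multiplies the determinant by more than C, which is why the loop terminates on a finite set.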

Abstract

A major challenge in reinforcement learning is to develop practical, sample-efficient algorithms for exploration in high-dimensional domains where generalization and function approximation are required. Low-Rank Markov Decision Processes -- where transition probabilities admit a low-rank factorization based on an unknown feature embedding -- offer a simple, yet expressive framework for RL with function approximation, but existing algorithms are either (1) computationally intractable, or (2) reliant upon restrictive statistical assumptions such as latent variable structure, access to model-based function approximation, or reachability. In this work, we propose the first provably sample-efficient algorithm for exploration in Low-Rank MDPs that is both computationally efficient and model-free, allowing for general function approximation and requiring no additional structural assumptions.…
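The low-rank factorization mentioned in the abstract is commonly written as an inner product of two d-dimensional embeddings. The symbols below (T for the transition kernel, x for states, a for actions, and the embeddings phi and mu) are illustrative notation assumed here rather than taken from this page; the feature embedding phi is the part that is unknown to the learner.

```latex
% Illustrative notation (an assumption), not copied from the paper.
T(x' \mid x, a) \;=\; \big\langle \phi^\star(x, a),\, \mu^\star(x') \big\rangle,
\qquad \phi^\star(x, a),\ \mu^\star(x') \in \mathbb{R}^d .
```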

Taxonomy

Topics: Reinforcement Learning in Robotics · Adversarial Robustness in Machine Learning · Machine Learning and Algorithms