Exploring and Learning in Sparse Linear MDPs without Computationally Intractable Oracles

Noah Golowich; Ankur Moitra; Dhruv Rohatgi

arXiv:2309.09457·cs.LG·September 20, 2023

TL;DR

This paper introduces a polynomial-time algorithm for learning in sparse linear MDPs without relying on intractable oracles, advancing feature selection and efficient policy learning in high-dimensional settings.

Contribution

It presents the first polynomial-time algorithm for sparse linear MDPs, utilizing the concept of emulators for efficient Bellman backup computations.

Findings

01 Efficiently computable emulators exist for the transition representations.

02 The algorithm learns a near-optimal policy in $\mathrm{poly}(k, \log d)$ interactions with the environment in the $k$-sparse setting.

03 The approach extends to block MDPs with low-depth decision trees, enabling sample-efficient learning.

Abstract

The key assumption underlying linear Markov Decision Processes (MDPs) is that the learner has access to a known feature map $\phi(x, a)$ that maps state-action pairs to $d$-dimensional vectors, and that the rewards and transitions are linear functions in this representation. But where do these features come from? In the absence of expert domain knowledge, a tempting strategy is to use the "kitchen sink" approach and hope that the true features are included in a much larger set of potential features. In this paper we revisit linear MDPs from the perspective of feature selection. In a $k$-sparse linear MDP, there is an unknown subset $S \subset [d]$ of size $k$ containing all the relevant features, and the goal is to learn a near-optimal policy in only $\mathrm{poly}(k, \log d)$ interactions with the environment. Our main result is the first polynomial-time algorithm for this problem. In contrast,…
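To make the sparsity assumption in the abstract concrete, the following is a minimal sketch of the reward side of a $k$-sparse linear model only. It is not the paper's algorithm. The names `make_sparse_theta` and `linear_reward`, the weight vector `theta`, and the sizes `d = 1000`, `k = 3` are hypothetical choices for illustration. The sketch assumes the reward is the inner product of $\phi(x, a)$ with a weight vector supported on $S$, so that features outside $S$ have no effect.

```python
import numpy as np

def make_sparse_theta(d, support, rng):
    """Return a d-dimensional weight vector that is zero outside `support`."""
    theta = np.zeros(d)
    theta[list(support)] = rng.uniform(-1.0, 1.0, size=len(support))
    return theta

def linear_reward(phi, theta):
    """Reward as a linear function of the feature vector: <phi, theta>."""
    return float(phi @ theta)

rng = np.random.default_rng(0)
d, k = 1000, 3  # hypothetical ambient dimension and sparsity level

# Unknown-to-the-learner support S of size k (fixed here for illustration).
S = sorted(rng.choice(d, size=k, replace=False).tolist())
theta = make_sparse_theta(d, S, rng)

# A "kitchen sink" feature vector phi(x, a) for one state-action pair.
phi = rng.normal(size=d)

# The reward computed from all d features equals the reward computed
# from the k relevant features alone.
full = linear_reward(phi, theta)
restricted = linear_reward(phi[S], theta[S])

# Changing any irrelevant feature leaves the reward unchanged.
phi_perturbed = phi.copy()
irrelevant = [i for i in range(d) if i not in S]
phi_perturbed[irrelevant] += 10.0
perturbed = linear_reward(phi_perturbed, theta)
```

Here `full`, `restricted`, and `perturbed` agree up to floating-point error, which is the sense in which only the coordinates in $S$ are relevant. The learner does not know $S$ in advance, which is what makes the feature selection problem nontrivial.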

Taxonomy

Topics: Machine Learning and Algorithms · Reinforcement Learning in Robotics · Receptor Mechanisms and Signaling