Efficient Reinforcement Learning in Block MDPs: A Model-free Representation Learning Approach

Xuezhou Zhang; Yuda Song; Masatoshi Uehara; Mengdi Wang; Alekh Agarwal; Wen Sun

arXiv:2202.00063 · cs.LG · October 12, 2022

TL;DR

This paper introduces BRIEE, a model-free reinforcement learning algorithm that efficiently learns near-optimal policies in block-structured MDPs with rich observations, by interleaving state discovery, exploration, and exploitation.

Contribution

The paper proposes BRIEE, a novel algorithm that provably learns a near-optimal policy in Block MDPs, with sample complexity independent of the size of the observation space.

Findings

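01 BRIEE outperforms HOMER, the state-of-the-art Block MDP algorithm, and other empirical RL baselines on rich-observation combination lock problems that require deep exploration.

02 BRIEE's sample complexity scales polynomially in the number of latent states, actions, and the time horizon, with no dependence on the size of the observation space.

03 The empirical comparison is in terms of sample efficiency: BRIEE is more sample efficient than the baselines it is compared against.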

Abstract

We present BRIEE (Block-structured Representation learning with Interleaved Explore Exploit), an algorithm for efficient reinforcement learning in Markov Decision Processes with block-structured dynamics (i.e., Block MDPs), where rich observations are generated from a set of unknown latent states. BRIEE interleaves latent state discovery, exploration, and exploitation, and can provably learn a near-optimal policy with sample complexity scaling polynomially in the number of latent states, actions, and the time horizon, with no dependence on the size of the potentially infinite observation space. Empirically, we show that BRIEE is more sample efficient than the state-of-the-art Block MDP algorithm HOMER and other empirical RL baselines on challenging rich-observation combination lock problems that require deep exploration.
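
To make the Block MDP setting in the abstract concrete, here is a minimal Python sketch of a toy rich-observation combination lock. It is not the environment used in the paper, and it is not BRIEE itself. The class name `ToyComboLock`, the helper `decode`, the choice of three latent states per step, and the noise-padded one-hot observations are all assumptions made for illustration. The point it shows is the block structure: every observation is emitted by exactly one latent state, so the latent state can in principle be decoded from the observation, even though the learner is never given the decoder.

```python
import numpy as np


class ToyComboLock:
    """Hypothetical toy Block MDP: latent states 0 and 1 are good, 2 is bad."""

    def __init__(self, horizon=5, num_actions=4, noise_dim=8, seed=0):
        self.H, self.A, self.noise_dim = horizon, num_actions, noise_dim
        self.rng = np.random.default_rng(seed)
        # One secret "good" action per (time step, good latent state).
        self.good_action = self.rng.integers(num_actions, size=(horizon, 2))
        self.reset()

    def _emit(self, z):
        # Block structure: the first 3 coordinates identify the latent state,
        # the remaining coordinates are pure noise (the "rich" part).
        one_hot = np.eye(3)[z]
        noise = self.rng.normal(size=self.noise_dim)
        return np.concatenate([one_hot, noise])

    def reset(self):
        self.h, self.z = 0, 0
        return self._emit(self.z)

    def step(self, action):
        if self.z != 2 and action == self.good_action[self.h, self.z]:
            self.z = int(self.rng.integers(2))  # move to a random good state
        else:
            self.z = 2  # the bad state is absorbing
        self.h += 1
        done = self.h == self.H
        # Reward only for reaching the end without ever leaving the good states.
        reward = 1.0 if (done and self.z != 2) else 0.0
        return self._emit(self.z), reward, done


def decode(obs):
    """Ground-truth decoder for this toy emission model (unknown to a learner)."""
    return int(np.argmax(obs[:3]))
```

In this toy, a uniformly random policy reaches the reward with probability `(1 / num_actions) ** horizon`, which is why problems of this shape are described as requiring deep exploration.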

Code & Models

Repositories

yudasong/briee
PyTorch · Official

Taxonomy

Topics: Reinforcement Learning in Robotics · Data Stream Mining Techniques