Provably efficient RL with Rich Observations via Latent State Decoding

Simon S. Du; Akshay Krishnamurthy; Nan Jiang; Alekh Agarwal; Miroslav Dudík; John Langford

arXiv:1901.09018·cs.LG·September 10, 2021·58 cites

TL;DR

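This paper introduces a provably efficient reinforcement learning method for episodic MDPs whose rich observations are generated from a small number of latent states. It learns a mapping from observations to latent states, uses that mapping to build exploration policies, and comes with finite-sample guarantees and empirical improvements over Q-learning with naive exploration.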

Contribution

It proposes a latent state decoding approach, built inductively from a sequence of regression and clustering steps, with finite-sample guarantees on both the learned decoding function and the resulting exploration policies.

Findings

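01

The method improves exponentially over Q-learning with naive exploration on hard exploration tasks, even when Q-learning is given access to the latent states.

02

Finite-sample guarantees cover both the learned state decoding function and the quality of the exploration policies.

03

The theory is complemented by an empirical evaluation on a class of hard exploration problems.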

Abstract

We study the exploration problem in episodic MDPs with rich observations generated from a small number of latent states. Under certain identifiability assumptions, we demonstrate how to estimate a mapping from the observations to latent states inductively through a sequence of regression and clustering steps -- where previously decoded latent states provide labels for later regression problems -- and use it to construct good exploration policies. We provide finite-sample guarantees on the quality of the learned state decoding function and exploration policies, and complement our theory with an empirical evaluation on a class of hard exploration problems. Our method exponentially improves over $Q$-learning with naïve exploration, even when $Q$-learning has cheating access to latent states.
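To make the "regression and clustering" step in the abstract concrete, here is a minimal sketch of a single inductive step on toy data. It is an illustration under stated assumptions, not the authors' implementation. The function name `decode_next_level`, the linear least-squares regressor, the hand-rolled k-means, and the two-state toy dynamics are all hypothetical choices made for this example. The idea shown is: use the previously decoded state and the action taken as a regression label, predict that label from the next observation, then cluster the predictions so that observations with similar predictions share a decoded state.

```python
import numpy as np

def decode_next_level(obs, prev_state, action, n_prev, n_actions, n_next, n_iter=20):
    """Hypothetical single step: regress the (previous state, action) label on obs, then cluster."""
    n = obs.shape[0]
    # Regression targets: one-hot encoding of the (decoded previous state, action) pair.
    labels = prev_state * n_actions + action
    targets = np.eye(n_prev * n_actions)[labels]
    # Assumption: linear least squares with a bias feature stands in for a generic regressor.
    feats = np.hstack([obs, np.ones((n, 1))])
    weights, *_ = np.linalg.lstsq(feats, targets, rcond=None)
    preds = feats @ weights
    # Cluster the predictions with a small k-means, using farthest-point initialization.
    centers = [preds[0]]
    for _ in range(1, n_next):
        d = np.min([np.sum((preds - c) ** 2, axis=1) for c in centers], axis=0)
        centers.append(preds[np.argmax(d)])
    centers = np.array(centers)
    for _ in range(n_iter):
        dists = ((preds[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        assign = dists.argmin(axis=1)
        for k in range(n_next):
            if np.any(assign == k):
                centers[k] = preds[assign == k].mean(axis=0)
    return assign

# Toy data (all assumed): 2 previous states, 2 actions, 2 next latent states.
rng = np.random.default_rng(0)
n = 2000
prev_state = rng.integers(0, 2, size=n)   # treated as already decoded
action = rng.integers(0, 2, size=n)       # uniformly random actions
next_state = prev_state ^ action          # hidden next latent state (XOR dynamics)
means = np.array([[1.0, 0.0], [0.0, 1.0]])
obs = means[next_state] + 0.1 * rng.normal(size=(n, 2))  # noisy observations

decoded = decode_next_level(obs, prev_state, action, n_prev=2, n_actions=2, n_next=2)
agreement = np.mean(decoded == next_state)
accuracy = max(agreement, 1 - agreement)  # cluster ids are only defined up to relabeling
print(f"decoding accuracy up to relabeling: {accuracy:.3f}")
```

In this toy setup the two next states are reached from different (state, action) pairs, so their predicted label distributions differ and clustering can separate them. That is only meant to loosely mirror the identifiability assumptions mentioned in the abstract; the precise conditions are not stated here.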

Code & Models

Repositories

Microsoft/StateDecoding (torch, official)

Taxonomy

Topics: Machine Learning and Algorithms · Data Stream Mining Techniques · Adversarial Robustness in Machine Learning