FLAMBE: Structural Complexity and Representation Learning of Low Rank MDPs

Alekh Agarwal; Sham Kakade; Akshay Krishnamurthy; Wen Sun

arXiv:2006.10814 · cs.LG · July 23, 2020 · 36 cites


TL;DR

This paper introduces FLAMBE, an algorithm for efficient reinforcement learning in low rank MDPs, by leveraging a novel connection to non-linear matrix decomposition and latent variable models for representation learning.

Contribution

The paper establishes a theoretical link between low rank MDPs and latent variable models, and develops FLAMBE for exploration and representation learning in such settings.

Findings

01. FLAMBE achieves provably efficient RL in low rank transition models.

02. The work connects low rank MDPs with non-linear matrix decomposition and latent variable models.

03. The approach generalizes prior representation learning methods in RL.

Abstract

In order to deal with the curse of dimensionality in reinforcement learning (RL), it is common practice to make parametric assumptions where values or policies are functions of some low dimensional feature space. This work focuses on the representation learning question: how can we learn such features? Under the assumption that the underlying (unknown) dynamics correspond to a low rank transition matrix, we show how the representation learning question is related to a particular non-linear matrix decomposition problem. Structurally, we make precise connections between these low rank MDPs and latent variable models, showing how they significantly generalize prior formulations for representation learning in RL. Algorithmically, we develop FLAMBE, which engages in exploration and representation learning for provably efficient RL in low rank transition models.
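To make the phrase "low rank transition matrix" concrete, here is a minimal sketch of a latent variable transition model, one special case of the low rank structure the abstract describes. It is not FLAMBE and not code from the paper. The function name, the sizes, and the Dirichlet sampling are all assumptions chosen for illustration. Each state-action pair maps to a distribution `phi` over `d` latent variables, and each latent variable maps to a distribution `mu` over next states, so the transition matrix factorizes as `T = phi @ mu` and has rank at most `d`.

```python
import numpy as np


def make_latent_variable_transitions(n_states, n_actions, d, seed=0):
    """Build a toy transition matrix that factorizes through d latent variables.

    Hypothetical helper for illustration only; not part of FLAMBE.
    """
    rng = np.random.default_rng(seed)
    # phi[i] is a probability distribution over the d latent variables
    # for the i-th (state, action) pair. Shape: (n_states * n_actions, d).
    phi = rng.dirichlet(np.ones(d), size=n_states * n_actions)
    # mu[z] is a probability distribution over next states given latent z.
    # Shape: (d, n_states).
    mu = rng.dirichlet(np.ones(n_states), size=d)
    # T[i, s'] = sum_z phi[i, z] * mu[z, s'] = <phi(s, a), mu(s')>.
    T = phi @ mu
    return phi, mu, T


phi, mu, T = make_latent_variable_transitions(n_states=20, n_actions=3, d=4)
rank = np.linalg.matrix_rank(T)
print("transition matrix shape:", T.shape)
print("numerical rank:", rank)
```

The matrix has 60 rows and 20 columns, yet its rank cannot exceed the latent dimension 4. The representation learning question in the abstract is the reverse direction: given only samples from `T`, recover features that play the role of `phi`.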


Videos

FLAMBE: Structural Complexity and Representation Learning of Low Rank MDPs · slideslive

Taxonomy

Topics: Advanced Bandit Algorithms Research · Reinforcement Learning in Robotics · Model Reduction and Neural Networks