FLAMBE: Structural Complexity and Representation Learning of Low Rank MDPs
Alekh Agarwal, Sham Kakade, Akshay Krishnamurthy, Wen Sun

TL;DR
This paper introduces FLAMBE, an algorithm for provably efficient reinforcement learning in low rank MDPs, and relates representation learning in these models to a non-linear matrix decomposition problem and to latent variable models.
Contribution
The paper makes precise the connection between low rank MDPs and latent variable models, and develops FLAMBE, an algorithm that combines exploration with representation learning in low rank transition models.
Findings
FLAMBE achieves provably efficient RL in low rank transition models.
The work connects low rank MDPs with non-linear matrix decomposition and latent variable models.
Low rank MDPs significantly generalize prior formulations for representation learning in RL.
Abstract
In order to deal with the curse of dimensionality in reinforcement learning (RL), it is common practice to make parametric assumptions where values or policies are functions of some low dimensional feature space. This work focuses on the representation learning question: how can we learn such features? Under the assumption that the underlying (unknown) dynamics correspond to a low rank transition matrix, we show how the representation learning question is related to a particular non-linear matrix decomposition problem. Structurally, we make precise connections between these low rank MDPs and latent variable models, showing how they significantly generalize prior formulations for representation learning in RL. Algorithmically, we develop FLAMBE, which engages in exploration and representation learning for provably efficient RL in low rank transition models.
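The low rank assumption in the abstract can be illustrated with a small tabular sketch. The code below is not FLAMBE and is not taken from the paper; it only builds a toy transition model from a latent variable model and checks that the resulting transition matrix has rank at most d. The sizes, the names phi and mu, and the use of Dirichlet sampling are all assumptions chosen for illustration.

```python
import numpy as np

# Hypothetical sizes: 6 states, 2 actions, latent dimension d = 3.
n_states, n_actions, d = 6, 2, 3
rng = np.random.default_rng(0)

# Latent variable model (illustrative assumption):
#   phi[s, a] is a distribution over d latent states z,
#   mu[z] is a distribution over next states s'.
phi = rng.dirichlet(np.ones(d), size=(n_states, n_actions))  # shape (S, A, d)
mu = rng.dirichlet(np.ones(n_states), size=d)                # shape (d, S)

# Transition model as an inner product of features:
#   T[s, a, s'] = sum_z phi[s, a, z] * mu[z, s']
T = phi @ mu                                                 # shape (S, A, S)

# Flatten (s, a) into rows to view T as an (S*A) x S matrix.
T_matrix = T.reshape(n_states * n_actions, n_states)
rank = np.linalg.matrix_rank(T_matrix)

print("rows sum to one:", np.allclose(T.sum(axis=-1), 1.0))
print("rank of transition matrix:", rank, "<= d =", d)
```

In this toy setting both factors are known by construction. The representation learning question in the abstract is the reverse direction: recovering features like phi when only samples from the dynamics are available.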
Taxonomy
Topics: Advanced Bandit Algorithms Research · Reinforcement Learning in Robotics · Model Reduction and Neural Networks
