Learning Markov State Abstractions for Deep Reinforcement Learning
Cameron Allen, Neev Parikh, Omer Gottesman, George Konidaris

TL;DR
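This paper introduces a method for learning Markov state abstractions for deep reinforcement learning. The training objective does not require a reward signal, although agents can exploit reward information when it is available, and the learned abstractions are reported to improve sample efficiency.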
Contribution
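It proposes a novel set of conditions, proved sufficient for a learned abstract state representation to be Markov, and a practical training procedure that combines inverse model estimation with temporal contrastive learning to approximately satisfy those conditions.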
Findings
The learned representations capture the structure of the domain.
The approach is more sample-efficient than existing methods.
It matches or exceeds the performance achieved with hand-designed state features.
Abstract
A fundamental assumption of reinforcement learning in Markov decision processes (MDPs) is that the relevant decision process is, in fact, Markov. However, when MDPs have rich observations, agents typically learn by way of an abstract state representation, and such representations are not guaranteed to preserve the Markov property. We introduce a novel set of conditions and prove that they are sufficient for learning a Markov abstract state representation. We then describe a practical training procedure that combines inverse model estimation and temporal contrastive learning to learn an abstraction that approximately satisfies these conditions. Our novel training objective is compatible with both online and offline training: it does not require a reward signal, but agents can capitalize on reward information when available. We empirically evaluate our approach on a visual gridworld…
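The combination of inverse model estimation and temporal contrastive learning described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the linear `tanh` encoder, the linear prediction heads, the shuffled-batch negatives, and every name below (`encode`, `inverse_loss`, `contrastive_loss`, `abstraction_losses`, `params`) are assumptions made for illustration. It uses plain Python so that it runs without any dependencies. Note that neither loss refers to a reward.

```python
import math
import random

def encode(obs, weights):
    """Hypothetical linear encoder: observation -> abstract state in (-1, 1)."""
    return [math.tanh(sum(o * w for o, w in zip(obs, col))) for col in weights]

def inverse_loss(z, z_next, action, inv_weights):
    """Cross-entropy for predicting which action led from z to z_next."""
    pair = z + z_next
    logits = [sum(p * w for p, w in zip(pair, row)) for row in inv_weights]
    top = max(logits)  # subtract the max for a stable log-sum-exp
    log_norm = top + math.log(sum(math.exp(l - top) for l in logits))
    return log_norm - logits[action]

def contrastive_loss(z, z_other, is_real, con_weights):
    """Binary cross-entropy for 'is (z, z_other) a real consecutive pair?'."""
    pair = z + z_other
    logit = sum(p * w for p, w in zip(pair, con_weights))
    # Numerically stable binary cross-entropy on a raw logit.
    return max(logit, 0.0) - logit * is_real + math.log1p(math.exp(-abs(logit)))

def abstraction_losses(batch, params, rng):
    """batch: list of (obs, action, next_obs). Returns the two mean losses."""
    zs = [encode(o, params["encoder"]) for o, _, _ in batch]
    z_nexts = [encode(o2, params["encoder"]) for _, _, o2 in batch]
    # Assumed negative sampling: pair each z with a next state drawn from a
    # shuffled copy of the batch (a shuffle may occasionally keep a true pair).
    shuffled = z_nexts[:]
    rng.shuffle(shuffled)
    inv = con = 0.0
    for (_, action, _), z, z_next, z_fake in zip(batch, zs, z_nexts, shuffled):
        inv += inverse_loss(z, z_next, action, params["inverse"])
        con += contrastive_loss(z, z_next, 1.0, params["contrastive"])
        con += contrastive_loss(z, z_fake, 0.0, params["contrastive"])
    n = len(batch)
    return inv / n, con / (2 * n)

# Toy usage with random transitions; all sizes are arbitrary.
rng = random.Random(0)
obs_dim, latent_dim, n_actions = 6, 2, 4
params = {
    "encoder": [[rng.gauss(0, 0.5) for _ in range(obs_dim)]
                for _ in range(latent_dim)],
    # Zero-initialised heads: losses start at log(n_actions) and log(2).
    "inverse": [[0.0] * (2 * latent_dim) for _ in range(n_actions)],
    "contrastive": [0.0] * (2 * latent_dim),
}
batch = [([rng.random() for _ in range(obs_dim)],
          rng.randrange(n_actions),
          [rng.random() for _ in range(obs_dim)]) for _ in range(8)]
inv, con = abstraction_losses(batch, params, rng)
```

In a real agent the two losses would be summed (possibly with weights) and minimised by gradient descent with respect to the encoder and both heads. The sketch only evaluates them, which is enough to show what each term measures.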
Taxonomy
Topics: Reinforcement Learning in Robotics · Domain Adaptation and Few-Shot Learning · Human Pose and Action Recognition
Methods: Contrastive Learning
