Learning Causal State Representations of Partially Observable Environments
Amy Zhang, Zachary C. Lipton, Luis Pineda, Kamyar Azizzadenesheli, Anima Anandkumar, Laurent Itti, Joelle Pineau, Tommaso Furlanello

TL;DR
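This paper introduces an algorithm that learns approximate causal state representations in POMDPs from RNNs trained to predict subsequent observations given the history. The learned representations make policy learning more efficient in environments with rich observation spaces, and the approach is supported by theoretical guarantees and empirical validation.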
Contribution
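The paper proposes a method to approximate causal states with RNNs, connects causal states to causal feature sets from the causal inference literature, and provides theoretical guarantees on the optimality of the continuous version of the representation under Lipschitz assumptions.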
Findings
Learned causal states improve policy learning efficiency.
Theoretical guarantees relate causal states to bisimulation.
Empirical results show superior performance over prior methods.
Abstract
Intelligent agents can cope with sensory-rich environments by learning task-agnostic state abstractions. In this paper, we propose an algorithm to approximate causal states, which are the coarsest partition of the joint history of actions and observations in partially-observable Markov decision processes (POMDP). Our method learns approximate causal state representations from RNNs trained to predict subsequent observations given the history. We demonstrate that these learned state representations are useful for learning policies efficiently in reinforcement learning problems with rich observation spaces. We connect causal states with causal feature sets from the causal inference literature, and also provide theoretical guarantees on the optimality of the continuous version of this causal state representation under Lipschitz assumptions by proving equivalence to bisimulation, a relation…
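As a rough illustration of what "partition of the history" means, the sketch below groups fixed-length observation histories that share the same empirical next-observation distribution. It is a toy tabular stand-in, not the paper's RNN-based method: the function name `causal_state_partition`, the fixed `history_len`, the one-step prediction horizon, and the omission of actions are all simplifying assumptions made for this example.

```python
from collections import Counter, defaultdict


def causal_state_partition(trajectories, history_len=2, decimals=2):
    """Group histories by their empirical next-observation distribution.

    Hypothetical helper for illustration only: histories that predict the
    same next-observation distribution are placed in the same group.
    """
    # Count which observation follows each length-`history_len` history.
    counts = defaultdict(Counter)
    for traj in trajectories:
        for t in range(history_len, len(traj)):
            history = tuple(traj[t - history_len:t])
            counts[history][traj[t]] += 1

    # Histories with the same (rounded) predictive distribution are merged.
    groups = defaultdict(list)
    for history, counter in counts.items():
        total = sum(counter.values())
        signature = tuple(
            sorted((obs, round(n / total, decimals)) for obs, n in counter.items())
        )
        groups[signature].append(history)

    return sorted(sorted(group) for group in groups.values())


# Toy process where the next observation depends only on the last one:
# 0 -> 1, 1 -> 2, 2 -> 1.
trajectory = [0, 1, 2, 1, 2, 1, 2]
print(causal_state_partition([trajectory], history_len=2))
```

In this toy process the histories (0, 1) and (2, 1) end in the same observation and predict the same next observation, so they fall into one group, while (1, 2) forms its own. The grouping is coarser than the raw set of histories but loses no one-step predictive information.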
Taxonomy
Topics: Reinforcement Learning in Robotics · Adversarial Robustness in Machine Learning · Machine Learning and Algorithms
Methods: Q-Learning · Dense Connections · Convolution · Deep Q-Network
