Learning Causal State Representations of Partially Observable Environments

Amy Zhang; Zachary C. Lipton; Luis Pineda; Kamyar Azizzadenesheli; Anima Anandkumar; Laurent Itti; Joelle Pineau; Tommaso Furlanello

arXiv:1906.10437·cs.LG·February 9, 2021·30 cites

Open Access

TL;DR

This paper introduces an algorithm to learn causal state representations in POMDPs using RNNs, enabling more efficient policy learning in environments with complex observations, backed by theoretical guarantees and empirical validation.

Contribution

It proposes a novel method to approximate causal states with RNNs, connecting causal inference with reinforcement learning, and provides theoretical bounds on optimality.

Findings

01. Learned causal states improve policy learning efficiency.

02. Theoretical guarantees relate causal states to bisimulation.

03. Empirical results show superior performance over prior methods.

Abstract

Intelligent agents can cope with sensory-rich environments by learning task-agnostic state abstractions. In this paper, we propose an algorithm to approximate causal states, which are the coarsest partition of the joint history of actions and observations in partially observable Markov decision processes (POMDPs). Our method learns approximate causal state representations from RNNs trained to predict subsequent observations given the history. We demonstrate that these learned state representations are useful for learning policies efficiently in reinforcement learning problems with rich observation spaces. We connect causal states with causal feature sets from the causal inference literature, and also provide theoretical guarantees on the optimality of the continuous version of this causal state representation under Lipschitz assumptions by proving equivalence to bisimulation, a relation…
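
To make the idea of a "coarsest partition of the joint history" concrete, here is a minimal tabular sketch. It is not the paper's RNN-based method. It assumes a hypothetical toy environment (a hidden bit flipped by action 1, with a noisy observation of the bit) and groups histories by their exact one-step predictive distribution, which simplifies the usual definition over full futures. The names `predictive_signature`, `causal_state_partition`, and `OBS_DIST` are illustrative only.

```python
from collections import defaultdict
from itertools import product

ACTIONS = (0, 1)
OBSERVATIONS = (0, 1)

# Assumed toy observation model: P(observation | hidden bit).
# The observation matches the hidden bit with probability 0.9.
OBS_DIST = {0: (0.9, 0.1), 1: (0.1, 0.9)}


def predictive_signature(history):
    """Next-observation distribution for each possible next action.

    In this hypothetical environment the hidden bit starts at 0 and is
    flipped by action 1, so it equals the parity of past flip actions.
    """
    bit = sum(action for action, _ in history) % 2
    return tuple(OBS_DIST[(bit + next_action) % 2] for next_action in ACTIONS)


def causal_state_partition(histories):
    """Group histories that make identical predictions about the next step."""
    groups = defaultdict(list)
    for history in histories:
        groups[predictive_signature(history)].append(history)
    return list(groups.values())


# Enumerate every length-2 history of (action, observation) pairs.
steps = list(product(ACTIONS, OBSERVATIONS))
histories = list(product(steps, repeat=2))
partition = causal_state_partition(histories)

print(len(histories), "histories ->", len(partition), "predictive classes")
```

In this toy case the sixteen histories collapse into two classes, one per value of the hidden bit. The approach described in the abstract replaces the exact table with an RNN trained to predict subsequent observations, so the classes are approximated from learned representations rather than enumerated.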

Taxonomy

Topics: Reinforcement Learning in Robotics · Adversarial Robustness in Machine Learning · Machine Learning and Algorithms

Methods: Q-Learning · Dense Connections · Convolution · Deep Q-Network