Deep Active Inference for Partially Observable MDPs

Otto van der Himst; Pablo Lanillos

arXiv:2009.03622 · cs.LG · February 8, 2021

TL;DR

This paper introduces a deep active inference model that learns effective policies directly from high-dimensional sensory inputs in partially observable environments, with performance comparable to or better than deep Q-learning.

Contribution

The paper presents a novel deep active inference framework that handles partial observability using variational autoencoders, extending previous models limited to fully observable domains.

Findings

01. Achieves comparable or superior performance to deep Q-learning on the OpenAI benchmark.

02. Successfully learns policies directly from high-dimensional sensory data.

03. Demonstrates the applicability of active inference in complex, partially observable settings.

Abstract

Deep active inference has been proposed as a scalable approach to perception and action that deals with large policy and state spaces. However, current models are limited to fully observable domains. In this paper, we describe a deep active inference model that can learn successful policies directly from high-dimensional sensory inputs. The deep learning architecture optimizes a variant of the expected free energy and encodes the continuous state representation by means of a variational autoencoder. We show, in the OpenAI benchmark, that our approach has comparable or better performance than deep Q-learning, a state-of-the-art deep reinforcement learning algorithm.
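The two ingredients named in the abstract, a variational autoencoder for the state representation and action selection driven by expected free energy, can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the `precision` parameter, and the softmax-over-negative-EFE rule are assumptions drawn from common practice in the active inference literature, and the abstract does not say which exact variant of the expected free energy the paper uses. The KL term shown is the standard regulariser for a VAE with a diagonal Gaussian encoder and a standard normal prior.

```python
import math


def softmax(values, precision=1.0):
    """Numerically stable softmax over a list of floats."""
    scaled = [precision * v for v in values]
    peak = max(scaled)
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def policy_from_efe(efe_per_action, precision=1.0):
    """Hypothetical action rule: lower expected free energy -> higher probability.

    `efe_per_action` holds one expected free energy estimate per action;
    `precision` is an assumed temperature-like parameter.
    """
    return softmax([-g for g in efe_per_action], precision)


def gaussian_kl_to_standard_normal(mean, log_var):
    """KL( N(mean, diag(exp(log_var))) || N(0, I) ) for a diagonal Gaussian.

    This is the closed-form regulariser of a standard VAE encoder.
    """
    return 0.5 * sum(
        math.exp(lv) + mu * mu - 1.0 - lv for mu, lv in zip(mean, log_var)
    )


if __name__ == "__main__":
    # Two actions with made-up expected free energy estimates.
    print(policy_from_efe([0.0, 2.0]))
    # A made-up two-dimensional latent code.
    print(gaussian_kl_to_standard_normal([1.0, 0.0], [0.0, 0.0]))
```

In a full agent, the expected free energy estimates would come from a learned network evaluated on the VAE's latent state rather than from fixed numbers as in this sketch.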

Code & Models

Repositories

Grottoh/Deep-Active-Inference-for-Partially-Observable-MDPs
PyTorch · Official
