# Universal Reinforcement Learning Algorithms: Survey and Experiments

**Authors:** John Aslanides, Jan Leike, Marcus Hutter

arXiv: 1705.10557 · 2017-05-31

## TL;DR

This paper surveys universal reinforcement learning (URL) algorithms, which make as few assumptions as possible about the environment. It presents them under a unified notation and framework, reports qualitative experiments on partially observable gridworld environments, and releases an open-source reference implementation.

## Contribution

The paper reports what the authors describe as the first empirical investigation of URL agents such as AIXI, presents the algorithms under a unified notation and framework, and provides an open-source reference implementation for further experimentation.

## Key findings

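- The experiments qualitatively illustrate properties of the policies that the different URL agents produce.
- The agents' relative performance is compared on partially observable gridworld environments.
- An open-source reference implementation of the algorithms is provided to support further experimentation.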

## Abstract

Many state-of-the-art reinforcement learning (RL) algorithms typically assume that the environment is an ergodic Markov Decision Process (MDP). In contrast, the field of universal reinforcement learning (URL) is concerned with algorithms that make as few assumptions as possible about the environment. The universal Bayesian agent AIXI and a family of related URL algorithms have been developed in this setting. While numerous theoretical optimality results have been proven for these agents, there has been no empirical investigation of their behavior to date. We present a short and accessible survey of these URL algorithms under a unified notation and framework, along with results of some experiments that qualitatively illustrate some properties of the resulting policies, and their relative performance on partially-observable gridworld environments. We also present an open-source reference implementation of the algorithms which we hope will facilitate further understanding of, and experimentation with, these ideas.
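As an illustration of what "Bayesian agent" means here: such an agent keeps a weighted mixture over candidate environment models and reweights them by Bayes' rule as percepts arrive. The block below is a minimal sketch of that update only, assuming a finite class of two hypothetical Bernoulli reward models. AIXI itself is defined over a vastly larger class of environments, and this is not the paper's reference implementation; the names `bayes_update`, `models`, and `weights` are illustrative assumptions.

```python
from fractions import Fraction


def bayes_update(weights, likelihoods):
    """Return posterior model weights given prior weights and the
    likelihood each model assigns to the percept just observed."""
    joint = [w * l for w, l in zip(weights, likelihoods)]
    evidence = sum(joint)  # mixture probability of the percept
    if evidence == 0:
        raise ValueError("percept has zero probability under every model")
    return [j / evidence for j in joint]


# Hypothetical model class: each entry is P(reward = 1) under one model.
models = [Fraction(1, 4), Fraction(3, 4)]
# Uniform prior over the two models (an assumption of this sketch).
weights = [Fraction(1, 2), Fraction(1, 2)]

# Condition on an illustrative sequence of binary rewards.
for reward in [1, 1, 0, 1]:
    likelihoods = [p if reward == 1 else 1 - p for p in models]
    weights = bayes_update(weights, likelihoods)

print(weights)
```

After these four rewards the second model holds posterior weight 9/10, because the observed sequence is nine times more likely under it than under the first model. Exact fractions are used so the arithmetic can be checked by hand.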

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/1705.10557/full.md

## Figures

18 figures with captions in the complete paper: https://tomesphere.com/paper/1705.10557/full.md

## References

29 references — full list in the complete paper: https://tomesphere.com/paper/1705.10557/full.md

---
Source: https://tomesphere.com/paper/1705.10557