On the Possibility of Learning in Reactive Environments with Arbitrary Dependence

Daniil Ryabko; Marcus Hutter

arXiv:0810.5636·cs.LG·December 30, 2009

TL;DR

This paper studies reinforcement learning in environments where observations may depend arbitrarily on past observations and actions, and gives sufficient conditions under which an agent attains the best asymptotic reward in every environment of a known countable class.

Contribution

It introduces sufficient conditions for learning in environments with arbitrary stochastic dependence, extending RL theory beyond traditional Markovian assumptions.

Findings

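01 Identifies sufficient conditions on a class of environments under which some agent attains the best asymptotic reward in every environment of the class.

02 Analyzes how tight these conditions are and how they relate to classical probabilistic assumptions such as Markov Decision Processes and mixing conditions.

03 Provides theoretical insights into learning in non-Markovian, dependent environments.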

Abstract

We address the problem of reinforcement learning in which observations may exhibit an arbitrary form of stochastic dependence on past observations and actions, i.e. environments more general than (PO)MDPs. The task for an agent is to attain the best possible asymptotic reward where the true generating environment is unknown but belongs to a known countable family of environments. We find some sufficient conditions on the class of environments under which an agent exists which attains the best asymptotic reward for any environment in the class. We analyze how tight these conditions are and how they relate to different probabilistic assumptions known in reinforcement learning and related fields, such as Markov Decision Processes and mixing conditions.

Taxonomy

Topics: Advanced Bandit Algorithms Research · Reinforcement Learning in Robotics · Machine Learning and Algorithms