Reinforcement Learning with Algorithms from Probabilistic Structure Estimation

Jonathan P. Epperlein; Roman Overko; Sergiy Zhuk; Christopher King; Djallel Bouneffouf; Andrew Cullen; Robert Shorten

arXiv:2103.08241·cs.LG·June 2, 2022

TL;DR

This paper introduces a probabilistic structure estimation method for reinforcement learning. It adaptively chooses between lightweight myopic (contextual bandit) algorithms and more complex MDP algorithms, depending on whether the agent's actions appear to affect the environment.

Contribution

It proposes a framework based on a likelihood-ratio test that automatically selects the appropriate RL algorithm, without assuming in advance whether the agent's actions influence the environment.

Findings

01. The framework can effectively distinguish when myopic policies are optimal.

02. The proposed method provides a bound on regret in adaptive RL settings.

03. Simulations validate the approach in real-world scenarios.

Abstract

Reinforcement learning (RL) algorithms aim to learn optimal decisions in unknown environments through experience of taking actions and observing the rewards gained. In some cases, the environment is not influenced by the actions of the RL agent, in which case the problem can be modeled as a contextual multi-armed bandit and lightweight myopic algorithms can be employed. On the other hand, when the RL agent's actions affect the environment, the problem must be modeled as a Markov decision process and more complex RL algorithms are required which take the future effects of actions into account. Moreover, in practice, it is often unknown from the outset whether or not the agent's actions will impact the environment and it is therefore not possible to determine which RL algorithm is most fitting. In this work, we propose to avoid this difficult decision entirely and incorporate a choice…
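To make the bandit-versus-MDP distinction concrete, here is a minimal sketch of a generic likelihood-ratio test for whether state transitions depend on the chosen action. It is an illustration under stated assumptions, not the authors' procedure: it assumes a small tabular environment with fully observed transitions, and the names `lr_statistic`, `simulate`, `P_bandit`, and `P_mdp` are hypothetical. The null hypothesis is that the next-state distribution is the same for every action (bandit-like); the alternative allows one distribution per action (MDP-like).

```python
import numpy as np


def lr_statistic(counts):
    """Likelihood-ratio statistic for 'transitions do not depend on the action'.

    counts[s, a, s2] is the number of observed transitions s --a--> s2.
    Returns (statistic, degrees_of_freedom). Hypothetical helper for illustration.
    """
    counts = np.asarray(counts, dtype=float)
    n_sa = counts.sum(axis=2, keepdims=True)       # visits to each (s, a)
    n_s_next = counts.sum(axis=1, keepdims=True)   # transitions s -> s2 over all actions
    n_s = counts.sum(axis=(1, 2), keepdims=True)   # visits to each s

    # Ratio of maximum-likelihood estimates p(s2 | s, a) / p(s2 | s),
    # evaluated only where a transition was actually observed.
    mask = counts > 0
    ratio = np.ones_like(counts)
    np.divide(counts * n_s, n_sa * n_s_next, out=ratio, where=mask)

    stat = 2.0 * np.sum(counts * np.log(ratio))
    n_states, n_actions, _ = counts.shape
    dof = n_states * (n_actions - 1) * (n_states - 1)
    return stat, dof


def simulate(P, steps, rng):
    """Collect transition counts under a uniformly random policy (assumed setup)."""
    n_states, n_actions, _ = P.shape
    counts = np.zeros((n_states, n_actions, n_states))
    s = 0
    for _ in range(steps):
        a = rng.integers(n_actions)
        s2 = rng.choice(n_states, p=P[s, a])
        counts[s, a, s2] += 1
        s = s2
    return counts


rng = np.random.default_rng(0)

# Bandit-like: for each state, both actions give the same next-state distribution.
P_bandit = np.array([[[0.7, 0.3], [0.7, 0.3]],
                     [[0.4, 0.6], [0.4, 0.6]]])

# MDP-like: the action strongly changes the next-state distribution.
P_mdp = np.array([[[0.9, 0.1], [0.1, 0.9]],
                  [[0.9, 0.1], [0.1, 0.9]]])

stat_bandit, dof = lr_statistic(simulate(P_bandit, 2000, rng))
stat_mdp, _ = lr_statistic(simulate(P_mdp, 2000, rng))

print(f"degrees of freedom: {dof}")
print(f"bandit-like statistic: {stat_bandit:.2f}")
print(f"MDP-like statistic:    {stat_mdp:.2f}")
```

Under the null hypothesis the statistic is approximately chi-squared distributed with the reported degrees of freedom, so a value far above the corresponding critical value (about 5.99 at the 5% level for 2 degrees of freedom) suggests that actions do influence the environment and that a full MDP algorithm is warranted. A small value is consistent with the bandit model, where a myopic algorithm suffices.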

Code & Models

Repositories

roman1e2f5p8s/rlapseingym (Official)

Taxonomy

Topics: Advanced Bandit Algorithms Research · Data Stream Mining Techniques · Smart Grid Energy Management