Information-Theoretic Minimax Regret Bounds for Reinforcement Learning based on Duality

Raghav Bongole, Amaury Gouverneur, Borja Rodríguez-Gálvez, Tobias J. Oechtering, and Mikael Skoglund

arXiv:2410.16013·cs.LG·October 22, 2024

Open Access

TL;DR

This paper derives information-theoretic minimax regret bounds for reinforcement learning in finite-horizon MDPs, providing new theoretical insights into robust policy performance across unknown environments.

Contribution

The paper defines a minimax regret for finite-horizon MDPs and bounds it by combining minimax theorems with existing information-theoretic bounds on the Bayesian regret.

Findings

01

Derived minimax regret bounds for finite-horizon MDPs

02

Established minimax theorems linking Bayesian and minimax regret

03

Applied bounds to various reinforcement learning scenarios
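
The link between Bayesian and minimax regret in finding 02 can be illustrated on a toy problem. The sketch below is not the paper's MDP setting: it assumes a hypothetical 2x2 regret table `R` (two deterministic policies, two environments) and uses a simple grid search, purely to show that the best worst-case regret of a randomized policy matches the Bayesian regret under the least favorable prior. The names `R`, `grid`, `minimax_regret`, and `worst_case_bayesian_regret` are illustrative assumptions.

```python
import numpy as np

# Hypothetical regret table (assumption, not from the paper):
# rows = deterministic policies, columns = environments.
R = np.array([[0.0, 2.0],
              [1.0, 0.0]])

# Candidate mixing weights for a two-point distribution.
grid = np.linspace(0.0, 1.0, 301)


def minimax_regret(R, grid):
    """min over randomized policies of the max regret over environments."""
    best = np.inf
    for p in grid:
        mix = np.array([p, 1.0 - p])   # distribution over the two policies
        worst = (mix @ R).max()        # worst environment for this mixture
        best = min(best, worst)
    return best


def worst_case_bayesian_regret(R, grid):
    """max over priors of the min expected regret over policies."""
    best = -np.inf
    for q in grid:
        prior = np.array([q, 1.0 - q])  # prior over the two environments
        bayes = (R @ prior).min()       # best policy against this prior
        best = max(best, bayes)
    return best


print(minimax_regret(R, grid), worst_case_bayesian_regret(R, grid))
```

In this toy table both quantities come out at about 2/3, so any upper bound on the Bayesian regret that holds for every prior also bounds the minimax regret. The paper establishes the corresponding statement for finite-horizon MDPs under its own conditions, which this sketch does not attempt to reproduce.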

Abstract

We study agents acting in an unknown environment where the agent's goal is to find a robust policy. We consider robust policies as policies that achieve high cumulative rewards for all possible environments. To this end, we consider agents minimizing the maximum regret over different environment parameters, leading to the study of minimax regret. This research focuses on deriving information-theoretic bounds for minimax regret in Markov Decision Processes (MDPs) with a finite time horizon. Building on concepts from supervised learning, such as minimum excess risk (MER) and minimax excess risk, we use recent bounds on the Bayesian regret to derive minimax regret bounds. Specifically, we establish minimax theorems and use bounds on the Bayesian regret to perform minimax regret analysis using these minimax theorems. Our contributions include defining a suitable minimax regret in the…
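
The quantities named in the abstract can be written out as follows. This is a sketch in assumed notation, since the listing does not give the paper's symbols: $\theta \in \Theta$ is the unknown environment parameter, $\pi$ a policy, $T$ the horizon, $r_t$ the reward at step $t$, $\pi^{\star}_{\theta}$ an optimal policy for a known $\theta$, and $P$ a prior over $\Theta$.

```latex
% Regret of policy \pi in environment \theta (assumed notation)
\mathrm{Reg}(\pi, \theta)
  = \mathbb{E}_{\theta}^{\pi^{\star}_{\theta}}\!\left[\sum_{t=1}^{T} r_t\right]
  - \mathbb{E}_{\theta}^{\pi}\!\left[\sum_{t=1}^{T} r_t\right]

% Minimax regret: best worst-case regret over environments
\mathrm{Reg}_{\mathrm{minimax}}
  = \inf_{\pi} \sup_{\theta \in \Theta} \mathrm{Reg}(\pi, \theta)

% Bayesian regret under a prior P over \Theta
\mathrm{Reg}_{\mathrm{Bayes}}(P)
  = \inf_{\pi} \mathbb{E}_{\theta \sim P}\!\left[\mathrm{Reg}(\pi, \theta)\right]

% The inequality below always holds; a minimax theorem gives equality
\sup_{P} \mathrm{Reg}_{\mathrm{Bayes}}(P)
  \le \mathrm{Reg}_{\mathrm{minimax}}
```

When a minimax theorem turns the last inequality into an equality, an upper bound on the Bayesian regret that holds uniformly over priors carries over to the minimax regret, which is the route the abstract describes.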

Taxonomy

Topics: Neural Networks and Applications · Reinforcement Learning in Robotics