Reward-Free Policy Space Compression for Reinforcement Learning

Mirco Mutti; Stefano Del Col; Marcello Restelli

arXiv:2202.11079·cs.LG·February 23, 2022

Open Access

TL;DR

This paper introduces a game-theoretic method that compresses the infinite policy space in reinforcement learning into a finite set of representative policies, such that the state-action distribution of any policy stays within a bounded Rényi divergence of that of some representative.

Contribution

It formulates policy space compression as a set cover problem and proposes an efficient game-theoretic solution for reward-free policy compression.
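To make the set cover view concrete, here is a minimal Python sketch of the classic greedy set cover heuristic. It is not the paper's game-theoretic solution. It assumes a finite pool of candidate policies (the paper's policy space is infinite) and a hypothetical precomputed mapping `covers`, where `covers[k]` is the set of candidate indices that representative `k` is close enough to.

```python
def greedy_set_cover(covers):
    """Pick representatives until every coverable candidate is covered.

    covers: dict mapping a representative id to the set of candidate
    indices it covers (a hypothetical, precomputed input).
    """
    # Everything that at least one representative can cover.
    uncovered = set().union(*covers.values())
    chosen = []
    while uncovered:
        # Greedy step: take the representative covering the most
        # still-uncovered candidates.
        best = max(covers, key=lambda k: len(covers[k] & uncovered))
        chosen.append(best)
        uncovered -= covers[best]
    return chosen


# Toy instance: four possible representatives, four candidates (0..3).
toy_covers = {"a": {0, 1, 2}, "b": {2, 3}, "c": {3}, "d": {0}}
selected = greedy_set_cover(toy_covers)
```

On this toy instance the heuristic first picks "a", which covers three candidates, and then needs one more representative for candidate 3.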

Findings

01 Effective policy space compression demonstrated in simple domains

02 Reduces sample and computation inefficiencies in reinforcement learning

03 Provides a theoretical foundation for policy set approximation

Abstract

In reinforcement learning, we encode the potential behaviors of an agent interacting with an environment into an infinite set of policies, the policy space, typically represented by a family of parametric functions. Dealing with such a policy space is a hefty challenge, which often causes sample and computation inefficiencies. However, we argue that only a limited number of policies are actually relevant when we also account for the structure of the environment and of the policy parameterization, as many of them would induce very similar interactions, i.e., state-action distributions. In this paper, we seek a reward-free compression of the policy space into a finite set of representative policies, such that, given any policy $π$, the minimum Rényi divergence between the state-action distributions of the representative policies and the state-action distribution of $π$ is bounded. We…
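The compression criterion stated in the abstract can be sketched in a few lines of Python for the discrete case. This is only an illustration under assumptions that are not in the source: state-action distributions are given as finite probability vectors, the order `alpha` is greater than 1, the divergence is taken from the policy's distribution to each representative's, and the function names and the bound `sigma` are hypothetical.

```python
import math


def renyi_divergence(p, q, alpha=2.0):
    """Renyi divergence D_alpha(p || q) for discrete distributions.

    Uses D_alpha = log(sum_i p_i^alpha * q_i^(1 - alpha)) / (alpha - 1).
    This sketch only handles alpha > 1.
    """
    if alpha <= 1:
        raise ValueError("this sketch only handles alpha > 1")
    total = 0.0
    for p_i, q_i in zip(p, q):
        if p_i == 0.0:
            continue  # terms with p_i = 0 contribute nothing
        if q_i == 0.0:
            return math.inf  # p puts mass where q has none
        total += p_i ** alpha * q_i ** (1.0 - alpha)
    return math.log(total) / (alpha - 1.0)


def is_covered(d_pi, representatives, sigma, alpha=2.0):
    """True if some representative is within divergence sigma of d_pi."""
    closest = min(renyi_divergence(d_pi, d_k, alpha) for d_k in representatives)
    return closest <= sigma


# Toy state-action distributions over two state-action pairs.
d_pi = [0.5, 0.5]
d_rep = [0.25, 0.75]
gap = renyi_divergence(d_pi, d_rep)  # equals log(4/3) for alpha = 2
```

A compression with bound `sigma` would require `is_covered` to hold for every policy in the policy space, which is what makes the problem a covering problem rather than a single pairwise check.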


Taxonomy

Topics: Adversarial Robustness in Machine Learning · Reinforcement Learning in Robotics · Formal Methods in Verification