Statistical Analysis of Policy Space Compression Problem
Majid Molaei, Marcello Restelli, Alberto Maria Metelli, Matteo Papini

TL;DR
This paper investigates the sample complexity of policy space compression in reinforcement learning, using divergence measures to establish bounds for effective policy approximation.
Contribution
It introduces a method to determine sample size requirements for policy compression using Rényi divergence and $l_1$ norm, applicable to both model-based and model-free settings.
Findings
Derived error bounds for policy approximation accuracy.
Established sample size requirements for policy compression.
Linked divergence measures to policy space geometry.
Abstract
Policy search methods are crucial in reinforcement learning, offering a framework to address continuous state-action and partially observable problems. However, the complexity of exploring vast policy spaces can lead to significant inefficiencies. Reducing the policy space through policy compression emerges as a powerful, reward-free approach to accelerate the learning process. This technique condenses the policy space into a smaller, representative set while maintaining most of the original effectiveness. Our research focuses on determining the necessary sample size to learn this compressed set accurately. We employ Rényi divergence to measure the similarity between true and estimated policy distributions, establishing error bounds for good approximations. To simplify the analysis, we employ the $l_1$ norm, determining sample size requirements for both model-based and model-free…
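The two measures named in the abstract can be illustrated with a minimal sketch. It is not the paper's method: it assumes finite (discrete) distributions, uses the standard Rényi divergence of order $\alpha > 1$, $D_\alpha(p \| q) = \frac{1}{\alpha - 1} \log \sum_i p_i^{\alpha} q_i^{1-\alpha}$, and takes the divergence from the true distribution to the estimated one. The function names, the choice $\alpha = 2$, and the three-outcome example distribution are all hypothetical.

```python
import numpy as np


def empirical_distribution(samples, n_outcomes):
    """Estimate a discrete distribution from integer-coded samples (assumed setup)."""
    counts = np.bincount(np.asarray(samples, dtype=int), minlength=n_outcomes)
    return counts / counts.sum()


def l1_distance(p, q):
    """l_1 distance between two discrete distributions: sum_i |p_i - q_i|."""
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)
    return float(np.abs(p - q).sum())


def renyi_divergence(p, q, alpha=2.0):
    """Renyi divergence D_alpha(p || q) for discrete p, q with alpha > 1."""
    if alpha <= 1:
        raise ValueError("this sketch only handles alpha > 1")
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)
    support = p > 0
    # For alpha > 1 the divergence is infinite if q misses part of p's support.
    if np.any(q[support] == 0):
        return float("inf")
    total = np.sum(p[support] ** alpha * q[support] ** (1.0 - alpha))
    return float(np.log(total) / (alpha - 1.0))


if __name__ == "__main__":
    # Hypothetical "true" distribution over three outcomes.
    rng = np.random.default_rng(0)
    true_p = np.array([0.5, 0.3, 0.2])
    for n in (100, 1_000, 10_000):
        samples = rng.choice(len(true_p), size=n, p=true_p)
        est_p = empirical_distribution(samples, len(true_p))
        print(n, l1_distance(true_p, est_p), renyi_divergence(true_p, est_p))
```

Running the script prints both discrepancies for increasing sample sizes, which gives a rough feel for how estimation error depends on the number of samples. The paper's actual bounds and estimators are not reproduced here.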
Taxonomy
Topics: Distributed and Parallel Computing Systems
Methods: Sparse Evolutionary Training
