Model-Free Robust Reinforcement Learning with Sample Complexity Analysis

Yudan Wang; Shaofeng Zou; Yue Wang

arXiv:2406.17096·cs.LG·June 26, 2024


Open Access

TL;DR

This paper introduces a model-free distributionally robust reinforcement learning algorithm based on Multi-level Monte Carlo, with finite sample complexity guarantees for uncertainty sets defined by total variation, Chi-square divergence, and KL divergence.

Contribution

The paper presents the first model-free DR-RL algorithms with finite sample guarantees for total variation and Chi-square divergence uncertainty sets, and improves the sample complexity for the KL divergence case while broadening applicability.
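
For orientation, the worst-case objective and the three divergences named above can be written as follows. This is a sketch using standard textbook definitions, not notation taken from the paper: the discount factor \gamma, the radius \sigma, the nominal kernel p_s^a, and the (s,a)-rectangular form of the uncertainty set are all assumptions, and the paper's normalisation may differ.

```latex
% Robust value of a policy \pi: worst case over transition kernels in the uncertainty set
V^{\pi}_{\mathcal{P}}(s) = \min_{P \in \mathcal{P}} \mathbb{E}_{P,\pi}\Big[\sum_{t=0}^{\infty} \gamma^{t} r(s_t, a_t) \,\Big|\, s_0 = s\Big],
\qquad
\mathcal{P}_s^a = \{\, q \in \Delta(\mathcal{S}) : D(q \,\|\, p_s^a) \le \sigma \,\}

% Three standard choices of the divergence D
D_{\mathrm{TV}}(q \,\|\, p) = \tfrac{1}{2} \sum_{s'} |q(s') - p(s')|
D_{\chi^2}(q \,\|\, p) = \sum_{s'} \frac{(q(s') - p(s'))^2}{p(s')}
D_{\mathrm{KL}}(q \,\|\, p) = \sum_{s'} q(s') \log \frac{q(s')}{p(s')}
```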

Findings

01. First model-free DR-RL with finite sample guarantees for total variation and Chi-square divergence.

02. Improved sample complexity for KL divergence-based DR-RL.

03. Achieves the tightest complexity bounds for all three uncertainty models.

Abstract

Distributionally Robust Reinforcement Learning (DR-RL) aims to derive a policy optimizing the worst-case performance within a predefined uncertainty set. Despite extensive research, previous DR-RL algorithms have predominantly favored model-based approaches, with limited availability of model-free methods offering convergence guarantees or sample complexities. This paper proposes a model-free DR-RL algorithm leveraging the Multi-level Monte Carlo (MLMC) technique to close this gap. Our innovative approach integrates a threshold mechanism that ensures finite sample requirements for algorithmic implementation, a significant improvement over previous model-free algorithms. We develop algorithms for uncertainty sets defined by total variation, Chi-square divergence, and KL divergence, and provide finite sample analyses under all three cases. Remarkably, our algorithms represent the first…
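
To illustrate the idea of combining MLMC with a threshold, here is a minimal sketch of a generic truncated MLMC estimator for a nonlinear function f of a mean, f(E[X]). It is not the paper's estimator: the function name `truncated_mlmc`, the level cap `n_max`, and the renormalised geometric level distribution are assumptions chosen for illustration. The point it shows is that capping the random level bounds the number of samples per call at 2^(n_max+1).

```python
import random
from statistics import fmean


def truncated_mlmc(sample, f, n_max, rng):
    """One draw of a truncated MLMC estimate of f(E[X]) (illustrative sketch).

    sample(rng) returns one draw of X; f is the nonlinear function;
    n_max is the hypothetical level cap (the "threshold").
    Returns (estimate, number_of_samples_used).
    """
    # Truncated geometric level distribution: p_n proportional to 2^-(n+1), n = 0..n_max.
    weights = [0.5 ** (n + 1) for n in range(n_max + 1)]
    total = sum(weights)
    probs = [w / total for w in weights]
    n = rng.choices(range(n_max + 1), weights=probs)[0]

    # Draw 2^(n+1) samples; the cap guarantees at most 2^(n_max+1) per call.
    xs = [sample(rng) for _ in range(2 ** (n + 1))]

    # Level correction: f on the full batch minus the average of f on the two halves.
    fine = f(fmean(xs))
    coarse = 0.5 * (f(fmean(xs[0::2])) + f(fmean(xs[1::2])))

    # Base term from a single sample plus the importance-weighted correction.
    return f(xs[0]) + (fine - coarse) / probs[n], len(xs)


if __name__ == "__main__":
    rng = random.Random(0)
    draws = [
        truncated_mlmc(lambda r: r.gauss(1.0, 1.0), lambda m: m * m, 5, rng)
        for _ in range(20000)
    ]
    print("mean estimate:", fmean(d[0] for d in draws))
    print("max samples in one call:", max(d[1] for d in draws))
```

In this sketch the corrections telescope, so the estimator's expectation matches that of the plug-in estimate f(sample mean) built from 2^(n_max+1) samples. Truncation therefore trades a small, controllable bias for a hard cap on the sample cost of each call, whereas an untruncated geometric level has no such cap.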

Taxonomy

Topics: Reinforcement Learning in Robotics