Improved Sample Complexity Bounds for Distributionally Robust Reinforcement Learning

Zaiyan Xu; Kishan Panaganti; Dileep Kalathil

arXiv:2303.02783·cs.LG·May 23, 2023·1 cites

TL;DR

This paper introduces a new algorithm for distributionally robust reinforcement learning that improves sample complexity bounds across several divergence-based uncertainty sets and gives the first such bound for Wasserstein uncertainty sets.

Contribution

The paper proposes the Robust Phased Value Learning algorithm with improved sample complexity bounds for multiple divergence measures, including the first analysis for Wasserstein uncertainty sets.

Findings

01 Achieves $\tilde{O}(|S||A|H^{5})$ sample complexity, improving on prior bounds by a factor of $|S|$.

02 Provides the first sample complexity result for Wasserstein uncertainty sets.

03 Demonstrates effectiveness through simulation experiments.

Abstract

We consider the problem of learning a control policy that is robust against the parameter mismatches between the training environment and testing environment. We formulate this as a distributionally robust reinforcement learning (DR-RL) problem where the objective is to learn the policy which maximizes the value function against the worst possible stochastic model of the environment in an uncertainty set. We focus on the tabular episodic learning setting where the algorithm has access to a generative model of the nominal (training) environment around which the uncertainty set is defined. We propose the Robust Phased Value Learning (RPVL) algorithm to solve this problem for the uncertainty sets specified by four different divergences: total variation, chi-square, Kullback-Leibler, and Wasserstein. We show that our algorithm achieves $\tilde{O}(|S||A|H^{5})$ …
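
To make the "worst possible model in an uncertainty set" objective concrete, here is a minimal Python sketch of a robust Bellman backup for the total variation case only. It is not the RPVL algorithm from the paper: it assumes the nominal transition kernel `P` is already known rather than estimated from generative-model samples, and the function names, the radius `rho`, and the array shapes are all illustrative assumptions.

```python
import numpy as np


def worst_case_expectation_tv(p, v, rho):
    """Minimise q @ v over distributions q with 0.5 * ||q - p||_1 <= rho.

    Assumption: a plain total variation ball around the nominal
    distribution p. The minimiser moves up to rho probability mass from
    the highest-value states onto the lowest-value state.
    """
    q = np.asarray(p, dtype=float).copy()
    v = np.asarray(v, dtype=float)
    target = int(np.argmin(v))   # state that receives the shifted mass
    budget = float(rho)
    for s in np.argsort(-v):     # visit states from highest to lowest value
        if budget <= 0:
            break
        if s == target:
            continue
        moved = min(q[s], budget)
        q[s] -= moved
        q[target] += moved
        budget -= moved
    return float(q @ v)


def robust_value_iteration(P, R, H, rho):
    """Finite-horizon robust dynamic programming on a known nominal model.

    P: (S, A, S) nominal transition kernel (hypothetical input).
    R: (S, A) rewards.
    Returns the robust value at the first step and a per-step greedy policy.
    """
    S, A, _ = P.shape
    V = np.zeros(S)
    policy = np.zeros((H, S), dtype=int)
    for h in reversed(range(H)):
        Q = np.array([
            [R[s, a] + worst_case_expectation_tv(P[s, a], V, rho)
             for a in range(A)]
            for s in range(S)
        ])
        policy[h] = Q.argmax(axis=1)
        V = Q.max(axis=1)
    return V, policy
```

With `rho = 0` the backup reduces to ordinary finite-horizon value iteration on the nominal model; larger radii can only lower the computed value, since the adversary has more models to choose from.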

Code & Models

Repositories

zaiyan-x/RPVL (official)

Taxonomy

Topics: Reinforcement Learning in Robotics