Distributionally Robust Offline Reinforcement Learning with Linear Function Approximation

Xiaoteng Ma; Zhipeng Liang; Jose Blanchet; Mingwen Liu; Li Xia; Jiheng Zhang; Qianchuan Zhao; Zhengyuan Zhou

arXiv:2209.06620 · cs.LG · January 30, 2023 · 1 citation

Open Access

TL;DR

This paper develops distributionally robust offline reinforcement learning algorithms with linear function approximation, providing the first non-asymptotic sample complexity results and demonstrating their effectiveness through experiments.

Contribution

It introduces two novel algorithms for distributionally robust offline RL with linear function approximation, achieving new theoretical error bounds.

Findings

01. The proposed algorithms outperform non-robust baselines in experiments.

02. The paper provides the first non-asymptotic sample complexity bounds for this setting.

03. The experiments demonstrate robustness benefits in diverse scenarios.

Abstract

Among the reasons hindering reinforcement learning (RL) applications to real-world problems, two factors are critical: limited data and the mismatch between the testing environment (real environment in which the policy is deployed) and the training environment (e.g., a simulator). This paper attempts to address these issues simultaneously with distributionally robust offline RL, where we learn a distributionally robust policy using historical data obtained from the source environment by optimizing against a worst-case perturbation thereof. In particular, we move beyond tabular settings and consider linear function approximation. More specifically, we consider two settings, one where the dataset is well-explored and the other where the dataset has sufficient coverage of the optimal policy. We propose two algorithms, one for each of the two settings, that achieve error bounds…
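
To make "optimizing against a worst-case perturbation" concrete, here is a minimal sketch of a single robust evaluation step. It is not the paper's algorithm. The excerpt above does not say how the perturbation is measured, so the sketch assumes a KL-divergence ball of radius `rho` around a nominal next-state distribution `probs`, and it uses the standard dual form of the KL-constrained worst-case expectation. The function name `kl_robust_expectation` and all parameter names are hypothetical.

```python
import numpy as np


def kl_robust_expectation(values, probs, rho, iters=200):
    """Worst-case mean of `values` over all Q with KL(Q || P) <= rho.

    Assumption: KL uncertainty set (not stated in the abstract excerpt).
    Uses the dual  sup_{beta > 0} -beta * log E_P[exp(-V / beta)] - beta * rho,
    which is concave in beta, so a ternary search is enough.
    """
    v = np.asarray(values, dtype=float)
    p = np.asarray(probs, dtype=float)
    v, p = v[p > 0], p[p > 0]      # Q cannot put mass outside the support of P
    p = p / p.sum()
    v_min, spread = v.min(), v.max() - v.min()
    if rho <= 0 or spread == 0:
        return float(p @ v)        # no perturbation allowed, or V is constant

    def dual(beta):
        # Shift by v_min so every exponent is <= 0 and cannot overflow.
        return v_min - beta * np.log(p @ np.exp(-(v - v_min) / beta)) - beta * rho

    lo, hi = 1e-12, spread / rho   # the dual is non-increasing beyond spread / rho
    for _ in range(iters):
        m1, m2 = lo + (hi - lo) / 3, hi - (hi - lo) / 3
        if dual(m1) < dual(m2):
            lo = m1
        else:
            hi = m2
    return float(dual(0.5 * (lo + hi)))


# Two next states with values 1 and 0, equally likely under the nominal model.
next_values = np.array([1.0, 0.0])
nominal = np.array([0.5, 0.5])
for rho in (0.0, 0.1, 1.0):
    print(rho, kl_robust_expectation(next_values, nominal, rho))
```

With `rho = 0` the result is the ordinary expectation. As `rho` grows, the value falls toward the smallest value in the support of the nominal distribution, which is the pessimism a robust policy plans against.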

Taxonomy

Topics: Reinforcement Learning in Robotics · Auction Theory and Applications · Machine Learning and Algorithms