Distributionally Robust Offline Reinforcement Learning with Linear Function Approximation
Xiaoteng Ma, Zhipeng Liang, Jose Blanchet, Mingwen Liu, Li Xia, Jiheng Zhang, Qianchuan Zhao, Zhengyuan Zhou

TL;DR
This paper develops distributionally robust offline reinforcement learning algorithms with linear function approximation, providing the first non-asymptotic sample complexity results and demonstrating their effectiveness through experiments.
Contribution
It introduces two novel algorithms for distributionally robust offline RL with linear function approximation, achieving new theoretical error bounds.
Findings
The proposed algorithms outperform non-robust baselines in experiments.
The paper provides the first non-asymptotic sample complexity bounds for this setting.
The experiments demonstrate robustness benefits across diverse scenarios.
Abstract
Among the reasons hindering reinforcement learning (RL) applications to real-world problems, two factors are critical: limited data and the mismatch between the testing environment (real environment in which the policy is deployed) and the training environment (e.g., a simulator). This paper attempts to address these issues simultaneously with distributionally robust offline RL, where we learn a distributionally robust policy using historical data obtained from the source environment by optimizing against a worst-case perturbation thereof. In particular, we move beyond tabular settings and consider linear function approximation. More specifically, we consider two settings, one where the dataset is well-explored and the other where the dataset has sufficient coverage of the optimal policy. We propose two algorithms, one for each of the two settings, that achieve error bounds…
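To make "optimizing against a worst-case perturbation" concrete, here is a minimal sketch of a single robust backup over a finite next-state distribution. It assumes the uncertainty set is a KL-divergence ball of radius rho around the nominal distribution, which the abstract above does not specify, and it uses the standard dual form inf_{Q: KL(Q||P) <= rho} E_Q[V] = sup_{beta > 0} { -beta * log E_P[exp(-V / beta)] - beta * rho }. The function name, the grid of beta values, and the toy numbers are hypothetical. This illustrates the general idea only and is not the paper's linear function approximation algorithm.

```python
import numpy as np

def kl_robust_expectation(values, probs, rho, betas=None):
    """Approximate inf over Q with KL(Q || P) <= rho of E_Q[values].

    Uses the dual form sup_{beta > 0} -beta*log E_P[exp(-V/beta)] - beta*rho,
    maximised over a finite grid of beta. Each grid point is a valid lower
    bound on the worst-case expectation, so the result is conservative.
    """
    values = np.asarray(values, dtype=float)
    probs = np.asarray(probs, dtype=float)
    # Restrict to the support of the nominal distribution P.
    support = probs > 0
    values, probs = values[support], probs[support]
    if betas is None:
        # Hypothetical search grid; a 1-D optimiser could be used instead.
        betas = np.logspace(-3, 3, 2000)
    v_min = values.min()
    best = -np.inf
    for beta in betas:
        # log E_P[exp(-V/beta)], shifted by v_min for numerical stability.
        log_mgf = -v_min / beta + np.log(
            np.sum(probs * np.exp(-(values - v_min) / beta))
        )
        best = max(best, -beta * log_mgf - beta * rho)
    return best

# Toy example: three next states with nominal probabilities and values.
next_state_values = [1.0, 0.0, 2.0]
nominal_probs = [0.5, 0.3, 0.2]
for radius in (0.0, 0.1, 1.0):
    robust = kl_robust_expectation(next_state_values, nominal_probs, radius)
    print(f"rho={radius}: robust expected value ~ {robust:.4f}")
```

As the radius grows, the robust value moves from the nominal expectation toward the smallest value in the support, which is the pessimism a distributionally robust policy plans against.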
Taxonomy
Topics: Reinforcement Learning in Robotics · Auction Theory and Applications · Machine Learning and Algorithms
