Interpretable performance analysis towards offline reinforcement learning: A dataset perspective

Chenyang Xi; Bo Tang; Jiajun Shen; Xinfu Liu; Feiyu Xiong; Xueying Li

arXiv:2105.05473·cs.LG·May 13, 2021·1 cites

Open Access

TL;DR

This paper introduces a new dataset perspective for analyzing offline reinforcement learning, proposing a taxonomy, theoretical bounds, a modified algorithm, and a comprehensive benchmark platform to advance the field.

Contribution

It offers a novel taxonomy, theoretical insights into extrapolation error, an improved algorithm, and an open-source benchmark platform for offline RL research.

Findings

01 BCQ outperforms other techniques under certain conditions

02 The modified top return selection improves performance on low-return datasets

03 The RLEG benchmark enables fair comparison of offline RL algorithms
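
To make the second finding more concrete, the sketch below shows one plausible reading of "top return selection": keep only the highest-return episodes of an offline dataset before training. This summary does not describe the paper's actual mechanism, so everything here is an assumption for illustration, including the function names `episode_return` and `select_top_return_episodes`, the `top_fraction` parameter, and the simplified representation of an episode as a list of rewards.

```python
import math
from typing import List, Sequence


def episode_return(rewards: Sequence[float]) -> float:
    """Undiscounted return of one episode (sum of its rewards)."""
    return float(sum(rewards))


def select_top_return_episodes(
    episodes: List[Sequence[float]],
    top_fraction: float = 0.2,
) -> List[Sequence[float]]:
    """Keep the highest-return episodes of an offline dataset.

    Hypothetical helper: `top_fraction` is an illustrative knob,
    not a parameter taken from the paper.
    """
    if not 0.0 < top_fraction <= 1.0:
        raise ValueError("top_fraction must be in (0, 1]")
    if not episodes:
        return []
    # Rank episodes from highest to lowest return.
    ranked = sorted(episodes, key=episode_return, reverse=True)
    # Always keep at least one episode.
    keep = max(1, math.ceil(top_fraction * len(ranked)))
    return ranked[:keep]


if __name__ == "__main__":
    # Toy dataset: each inner list holds the rewards of one episode.
    dataset = [[1.0, 1.0], [5.0, 5.0], [0.0], [3.0]]
    print(select_top_return_episodes(dataset, top_fraction=0.5))
```

In a real pipeline each episode would carry full (state, action, reward, next state) transitions, and the filtered subset would then be passed to the offline learner; the filtering step itself would look much the same.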

Abstract

Offline reinforcement learning (RL) has increasingly become a focus of artificial intelligence research because of its wide range of real-world applications in which collecting data may be difficult, time-consuming, or costly. In this paper, we first propose a two-fold taxonomy for existing offline RL algorithms from the perspective of exploration and exploitation tendency. Second, we derive an explicit expression for the upper bound of the extrapolation error and explore the correlation between the performance of different types of algorithms and the distribution of actions under states. Specifically, we relax the strict assumption of a sufficiently large number of state-action tuples. Accordingly, we provably explain why batch-constrained Q-learning (BCQ) performs better than other existing techniques. Third, after identifying the weakness of BCQ on datasets with low mean episode returns,…

Taxonomy

Topics: Reinforcement Learning in Robotics · Smart Grid Energy Management · Data Stream Mining Techniques

Methods: Q-Learning