Measuring Data Quality for Dataset Selection in Offline Reinforcement Learning

Phillip Swazinna; Steffen Udluft; Thomas Runkler

arXiv:2111.13461·cs.LG·November 29, 2021


Open Access

TL;DR

This paper introduces simple, effective indicators for selecting high-quality datasets in offline reinforcement learning, addressing a previously overlooked practical challenge.

Contribution

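It proposes three straightforward dataset quality indicators: estimated relative return improvement (ERI), estimated action stochasticity (EAS), and a combination of the two (COI). It demonstrates their effectiveness in guiding dataset selection.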

Findings

01. ERI, EAS, and COI effectively identify promising datasets

02. Simple indicators outperform more complex methods in experiments

03. Dataset selection improves offline RL policy performance

Abstract

Recently developed offline reinforcement learning algorithms have made it possible to learn policies directly from pre-collected datasets, giving rise to a new dilemma for practitioners: since the performance these algorithms can deliver depends greatly on the dataset presented to them, practitioners need to pick the right dataset among the available ones. This problem has so far not been discussed in the corresponding literature. We discuss ideas for how to select promising datasets and propose three very simple indicators: estimated relative return improvement (ERI), estimated action stochasticity (EAS), and a combination of the two (COI). We show empirically that, despite their simplicity, they can be used very effectively for dataset selection.
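
The abstract names the three indicators but does not give their formulas, so the following Python sketch only illustrates the general idea of scoring and ranking candidate datasets. Every function, parameter, and formula here is a hypothetical stand-in and not the paper's definition: `eri_proxy` assumes the attainable return can be approximated by the best episodes in the data, `eas_proxy` assumes actions are grouped by a discrete state key, and `coi_proxy` assumes the combination is a simple product.

```python
from statistics import mean, pstdev


def eri_proxy(episode_returns, top_fraction=0.1):
    """Hypothetical stand-in for ERI (not the paper's formula).

    Relative gap between the mean of the best episodes and the mean
    of all episodes in the dataset.
    """
    avg = mean(episode_returns)
    if avg == 0:
        raise ValueError("mean return is zero; relative gap is undefined")
    ordered = sorted(episode_returns, reverse=True)
    k = max(1, int(len(ordered) * top_fraction))  # use at least one episode
    return (mean(ordered[:k]) - avg) / abs(avg)


def eas_proxy(actions_by_state):
    """Hypothetical stand-in for EAS (not the paper's formula).

    Average population standard deviation of the scalar actions
    observed for each state key.
    """
    return mean(pstdev(actions) for actions in actions_by_state.values())


def coi_proxy(eri, eas):
    """Assumed combination: a plain product of the two scores."""
    return eri * eas


def rank_datasets(datasets):
    """Return dataset names sorted from highest to lowest combined score."""
    scores = {
        name: coi_proxy(eri_proxy(d["returns"]), eas_proxy(d["actions_by_state"]))
        for name, d in datasets.items()
    }
    return sorted(scores, key=scores.get, reverse=True)
```

Under these assumptions, a dataset whose episodes all earn the same return with deterministic actions scores zero, while one with a spread of returns and varied actions per state ranks higher. The actual indicator definitions are given in the paper's main text.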

Taxonomy

Topics: Reinforcement Learning in Robotics