Measuring Data Quality for Dataset Selection in Offline Reinforcement Learning
Phillip Swazinna, Steffen Udluft, Thomas Runkler

TL;DR
This paper introduces simple, effective indicators for selecting high-quality datasets in offline reinforcement learning, addressing a previously overlooked practical challenge.
Contribution
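It proposes three straightforward dataset quality indicators: estimated relative return improvement (ERI), estimated action stochasticity (EAS), and a combination of the two (COI), and demonstrates empirically that they are effective for guiding dataset selection.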
Findings
ERI, EAS, and COI effectively identify promising datasets
Simple indicators outperform more complex methods in experiments
Dataset selection improves offline RL policy performance
Abstract
Recently developed offline reinforcement learning algorithms have made it possible to learn policies directly from pre-collected datasets, giving rise to a new dilemma for practitioners: since the performance the algorithms are able to deliver depends greatly on the dataset that is presented to them, practitioners need to pick the right dataset among the available ones. This problem has so far not been discussed in the corresponding literature. We discuss ideas on how to select promising datasets and propose three very simple indicators: estimated relative return improvement (ERI) and estimated action stochasticity (EAS), as well as a combination of the two (COI). We empirically show that, despite their simplicity, they can be used very effectively for dataset selection.
Taxonomy
Topics: Reinforcement Learning in Robotics
