Data Valuation for Offline Reinforcement Learning

Amir Abolfazli; Gregory Palmer; Daniel Kudenko

arXiv:2205.09550 · cs.LG · May 20, 2022

Open Access

TL;DR

This paper investigates the challenges of data transferability in offline reinforcement learning and introduces DVORL, a data valuation method that enhances policy performance and robustness when using externally acquired data.

Contribution

The paper proposes DVORL, a novel data valuation approach that improves offline RL transferability and performance on external datasets.

Findings

01. Current offline RL algorithms underperform with source-target domain mismatch.

02. DVORL identifies high-quality, relevant data to improve policy learning.

03. DVORL outperforms baseline methods on MuJoCo environments.
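
The second finding can be pictured as a filtering step applied before offline training: each transition receives a score, and only the highest-scoring part of the dataset is passed to the learner. The sketch below illustrates that generic idea only. It is not DVORL's estimator; the function name `select_by_value`, the `keep_fraction` parameter, and the example scores are all hypothetical, and how the scores are produced is deliberately left out.

```python
import math
from typing import List, Sequence, Tuple

# Illustrative transition layout: (state, action, reward, next_state).
Transition = Tuple[Sequence[float], int, float, Sequence[float]]


def select_by_value(
    transitions: List[Transition],
    values: Sequence[float],
    keep_fraction: float = 0.5,
) -> List[Transition]:
    """Keep the highest-scoring fraction of transitions, in their original order.

    `values` holds one hypothetical score per transition (higher means more
    useful for the target domain). Estimating those scores is out of scope.
    """
    if len(transitions) != len(values):
        raise ValueError("need exactly one value per transition")
    if not 0.0 < keep_fraction <= 1.0:
        raise ValueError("keep_fraction must be in (0, 1]")
    if not transitions:
        return []

    n_keep = max(1, math.ceil(keep_fraction * len(transitions)))
    # Rank indices by score, highest first, and keep the top n_keep.
    top = sorted(range(len(values)), key=lambda i: values[i], reverse=True)[:n_keep]
    # Restore dataset order so downstream code sees a stable sequence.
    return [transitions[i] for i in sorted(top)]


# Toy dataset with made-up scores (assumption, for illustration only).
demo: List[Transition] = [
    ([0.0], 0, 0.0, [0.1]),
    ([0.1], 1, 1.0, [0.2]),
    ([0.2], 0, 0.5, [0.3]),
    ([0.3], 1, 1.0, [0.4]),
]
demo_values = [0.1, 0.9, 0.4, 0.8]
filtered = select_by_value(demo, demo_values, keep_fraction=0.5)
```

With these toy scores, `filtered` contains the second and fourth transitions, which would then form the training batch for whichever offline RL algorithm is used.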

Abstract

The success of deep reinforcement learning (DRL) hinges on the availability of training data, which is typically obtained via a large number of environment interactions. In many real-world scenarios, costs and risks are associated with gathering these data. The field of offline reinforcement learning addresses these issues through outsourcing the collection of data to a domain expert or a carefully monitored program and subsequently searching for a batch-constrained optimal policy. With the emergence of data markets, an alternative to constructing a dataset in-house is to purchase external data. However, while state-of-the-art offline reinforcement learning approaches have shown a lot of promise, they currently rely on carefully constructed datasets that are well aligned with the intended target domains. This raises questions regarding the transferability and robustness of an offline…

Taxonomy

Topics: Reinforcement Learning in Robotics · Mobile Crowdsensing and Crowdsourcing · Machine Learning and Data Classification