Swapped goal-conditioned offline reinforcement learning

Wenyan Yang; Huiling Wang; Dingding Cai; Joni Pajarinen; Joni-Kristen Kämäräinen

arXiv:2302.08865·cs.LG·February 20, 2023·1 cites

TL;DR

This paper introduces a goal-swapping technique and a new offline RL method, DQAPG, to improve generalization and performance in goal-conditioned tasks, especially in complex manipulation scenarios.

Contribution

The paper proposes a novel goal-swapping data augmentation method and the DQAPG algorithm, enhancing offline GCRL performance and robustness against noise and extrapolation errors.

Findings

01. DQAPG outperforms state-of-the-art goal-conditioned offline RL methods across a wide range of benchmark tasks.

02. Goal-swapping further improves test results in goal-conditioned offline RL.

03. The method performs well on challenging dexterous in-hand manipulation tasks on which prior methods failed.

Abstract

Offline goal-conditioned reinforcement learning (GCRL) can be challenging due to overfitting to the given dataset. To generalize agents' skills outside the given dataset, we propose a goal-swapping procedure that generates additional trajectories. To alleviate the problem of noise and extrapolation errors, we present a general offline reinforcement learning method called deterministic Q-advantage policy gradient (DQAPG). In the experiments, DQAPG outperforms state-of-the-art goal-conditioned offline RL methods on a wide range of benchmark tasks, and goal-swapping further improves the test results. Notably, the proposed method performs well on challenging dexterous in-hand manipulation tasks on which prior methods failed.
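The abstract does not spell out the goal-swapping procedure, so the following is only a minimal sketch of one plausible reading: goals are shuffled across transitions in a batch and the reward is recomputed against the new goal. The `Transition` fields, the `sparse_reward` convention (0 on success, -1 otherwise), the tolerance `tol`, and the function names are all assumptions made for illustration and are not taken from the paper.

```python
import random
from dataclasses import dataclass, replace
from typing import List, Sequence, Tuple

Vec = Tuple[float, ...]


@dataclass(frozen=True)
class Transition:
    """One offline transition (hypothetical layout, not the paper's)."""
    state: Vec
    action: Vec
    next_state: Vec
    achieved_goal: Vec  # goal-space projection of next_state
    goal: Vec           # goal the original trajectory was conditioned on
    reward: float


def sparse_reward(achieved: Vec, goal: Vec, tol: float = 0.05) -> float:
    """Assumed convention: 0 if the goal is reached within tol, else -1."""
    dist = sum((a - g) ** 2 for a, g in zip(achieved, goal)) ** 0.5
    return 0.0 if dist <= tol else -1.0


def swap_goals(batch: Sequence[Transition], rng: random.Random,
               tol: float = 0.05) -> List[Transition]:
    """Return a copy of the batch with goals shuffled across transitions.

    States and actions are left untouched; only the conditioning goal
    changes, so the reward is relabeled against the swapped-in goal.
    """
    goals = [t.goal for t in batch]
    rng.shuffle(goals)
    return [
        replace(t, goal=g, reward=sparse_reward(t.achieved_goal, g, tol))
        for t, g in zip(batch, goals)
    ]
```

In this sketch the augmented transitions would be mixed with the original ones before training a goal-conditioned critic; how the paper actually selects goals and uses the resulting trajectories may differ.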

Code & Models

Repositories

jasonma2016/gofar
PyTorch · Official

Taxonomy

Topics: Reinforcement Learning in Robotics · Robot Manipulation and Learning · Machine Learning and Data Classification

Methods: Test