DiffStitch: Boosting Offline Reinforcement Learning with Diffusion-based Trajectory Stitching

Guanghe Li; Yixiang Shan; Zhengbang Zhu; Ting Long; Weinan Zhang

arXiv:2402.02439 · cs.LG · February 23, 2024 · 1 citation

TL;DR

DiffStitch is a diffusion-based data augmentation method for offline RL. It generates stitching transitions that connect low-reward trajectories to high-reward trajectories in the dataset, which improves policy performance across several RL algorithms.

Contribution

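The paper presents a diffusion-based trajectory stitching technique that augments offline RL datasets with generated transitions linking low-reward trajectories to high-reward ones, so that learned policies can reach high-reward regions even when the dataset contains few optimal trajectories.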

Findings

01. Significant performance improvements across multiple RL algorithms.

02. Effective connection of low- and high-reward trajectories.

03. Validated on D4RL datasets with consistent gains.

Abstract

In offline reinforcement learning (RL), the performance of the learned policy highly depends on the quality of offline datasets. However, in many cases, the offline dataset contains very limited optimal trajectories, which poses a challenge for offline RL algorithms as agents must acquire the ability to transit to high-reward regions. To address this issue, we introduce Diffusion-based Trajectory Stitching (DiffStitch), a novel diffusion-based data augmentation pipeline that systematically generates stitching transitions between trajectories. DiffStitch effectively connects low-reward trajectories with high-reward trajectories, forming globally optimal trajectories to address the challenges faced by offline RL algorithms. Empirical experiments conducted on D4RL datasets demonstrate the effectiveness of DiffStitch across RL methodologies. Notably, DiffStitch demonstrates substantial…
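
To make the idea of stitching concrete, the following is a minimal sketch of the data flow only, not the authors' implementation. It assumes trajectories are lists of (state, reward) pairs, and it swaps in plain linear interpolation where DiffStitch would use a diffusion model to generate the connecting states. All names (`interpolate_states`, `stitch`, `augment`, `fill_reward`) are hypothetical, and the placeholder reward assigned to bridge states is an assumption made for illustration.

```python
from typing import List, Tuple

State = Tuple[float, ...]
Trajectory = List[Tuple[State, float]]  # list of (state, reward) pairs


def trajectory_return(traj: Trajectory) -> float:
    """Sum of rewards along a trajectory."""
    return sum(reward for _, reward in traj)


def interpolate_states(start: State, end: State, n_steps: int) -> List[State]:
    """Hypothetical stand-in for the diffusion model.

    Returns n_steps states strictly between start and end, linearly spaced.
    """
    states = []
    for k in range(1, n_steps + 1):
        t = k / (n_steps + 1)
        states.append(tuple(s + t * (e - s) for s, e in zip(start, end)))
    return states


def stitch(low: Trajectory, high: Trajectory, n_steps: int,
           fill_reward: float = 0.0) -> Trajectory:
    """Join the end of `low` to the start of `high` with generated bridge states.

    fill_reward is a placeholder (assumption); it is not a learned reward.
    """
    bridge_states = interpolate_states(low[-1][0], high[0][0], n_steps)
    bridge = [(state, fill_reward) for state in bridge_states]
    return low + bridge + high


def augment(dataset: List[Trajectory], n_steps: int = 2) -> List[Trajectory]:
    """Append one stitched trajectory: lowest-return joined to highest-return."""
    ranked = sorted(dataset, key=trajectory_return)
    return dataset + [stitch(ranked[0], ranked[-1], n_steps)]


# Toy usage with two short 2-D trajectories.
low = [((0.0, 0.0), 0.0), ((1.0, 0.0), 0.0)]
high = [((4.0, 3.0), 1.0), ((5.0, 3.0), 1.0)]
augmented = augment([low, high])
stitched = augmented[-1]
print(len(augmented), len(stitched), trajectory_return(stitched))
```

In this toy version the augmented dataset keeps the original trajectories and adds one longer trajectory that starts in the low-reward data and ends in the high-reward data, which is the property the abstract attributes to the stitched transitions.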

Taxonomy

Topics: Robot Manipulation and Learning · Reinforcement Learning in Robotics