Prioritized Trajectory Replay: A Replay Memory for Data-driven Reinforcement Learning

Jinyi Liu; Yi Ma; Jianye Hao; Yujing Hu; Yan Zheng; Tangjie Lv; Changjie Fan

arXiv:2306.15503 · cs.LG · March 24, 2025

Open Access

TL;DR

This paper introduces a trajectory-based replay memory technique, Prioritized Trajectory Replay, that improves data sampling efficiency and performance in offline reinforcement learning by leveraging trajectory information and prioritized sampling.

Contribution

It proposes a novel trajectory sampling method, extending replay memory to trajectories, and introduces prioritized sampling to enhance offline RL performance.

Findings

1. Improved offline RL performance on D4RL benchmarks.

2. Enhanced data efficiency through backward trajectory sampling.

3. Effective avoidance of unseen actions during training.

Abstract

In recent years, data-driven reinforcement learning (RL), also known as offline RL, has gained significant attention. However, the role of data sampling techniques in offline RL has been overlooked, despite their potential to enhance online RL performance. Recent research suggests that applying sampling techniques directly to state-transitions does not consistently improve performance in offline RL. Therefore, in this study, we propose a memory technique, (Prioritized) Trajectory Replay (TR/PTR), which extends the sampling perspective to trajectories for more comprehensive information extraction from limited data. TR enhances learning efficiency by sampling trajectories backward, which optimizes the use of subsequent state information. Building on TR, we build the weighted critic target to avoid sampling unseen actions in offline training, and Prioritized Trajectory Replay (PTR) that enables…
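
The backward sampling idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the class name `BackwardTrajectoryReplay`, the transition tuple layout, and the slot-based batching are assumptions made for illustration, and trajectories are chosen uniformly at random, so neither the prioritization nor the weighted critic target is shown.

```python
import random
from typing import List, Sequence, Tuple

# Hypothetical transition layout: (state, action, reward, next_state, done).
Transition = Tuple[object, object, float, object, bool]


class BackwardTrajectoryReplay:
    """Illustrative replay memory that walks each trajectory from end to start."""

    def __init__(self, trajectories: Sequence[Sequence[Transition]],
                 batch_size: int, seed: int = 0) -> None:
        self.trajectories = [list(t) for t in trajectories if len(t) > 0]
        if not self.trajectories:
            raise ValueError("need at least one non-empty trajectory")
        self.rng = random.Random(seed)
        self.queue: List[int] = []
        # Each slot tracks [trajectory index, cursor position].
        self.slots = [self._next_slot() for _ in range(batch_size)]

    def _next_slot(self) -> List[int]:
        # Assumption: trajectories are drawn uniformly without replacement,
        # reshuffling once every trajectory has been handed out.
        if not self.queue:
            self.queue = list(range(len(self.trajectories)))
            self.rng.shuffle(self.queue)
        idx = self.queue.pop()
        # Start at the last transition so later states are visited first.
        return [idx, len(self.trajectories[idx]) - 1]

    def sample(self) -> List[Transition]:
        """Return one transition per slot, then move each cursor one step back."""
        batch = []
        for i, (idx, pos) in enumerate(self.slots):
            batch.append(self.trajectories[idx][pos])
            if pos == 0:
                # Trajectory exhausted: load a new one into this slot.
                self.slots[i] = self._next_slot()
            else:
                self.slots[i] = [idx, pos - 1]
        return batch
```

Because each slot emits the transition at step t+1 before the one at step t, a critic updated on consecutive batches sees the successor state's transition first, which is the intuition behind using subsequent state information. A prioritized variant would replace the uniform choice in `_next_slot` with a priority-weighted one.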

Taxonomy

Topics: Reinforcement Learning in Robotics