D-SPEAR: Dual-Stream Prioritized Experience Adaptive Replay for Stable Reinforcement Learning in Robotic Manipulation

Yu Zhang; Karl Mason

arXiv:2603.27346 · cs.RO · April 3, 2026

TL;DR

D-SPEAR introduces a dual-stream replay framework that decouples actor and critic sampling, enhancing stability and performance in robotic manipulation reinforcement learning tasks.

Contribution

The paper proposes an adaptive replay mechanism in which the actor and the critic sample from a shared buffer through separate streams, with the aim of improving training stability over existing replay strategies.

Findings

01

D-SPEAR outperforms SAC, TD3, and DDPG on Robosuite tasks.

02

The adaptive anchor balances uniform and prioritized sampling based on the coefficient of variation of TD errors.

03

Ablation studies confirm the benefits of dual-stream replay.

Abstract

Robotic manipulation remains challenging for reinforcement learning due to contact-rich dynamics, long horizons, and training instability. Although off-policy actor-critic algorithms such as SAC and TD3 perform well in simulation, they often suffer from policy oscillations and performance collapse in realistic settings, partly due to experience replay strategies that ignore the differing data requirements of the actor and the critic. We propose D-SPEAR: Dual-Stream Prioritized Experience Adaptive Replay, a replay framework that decouples actor and critic sampling while maintaining a shared replay buffer. The critic leverages prioritized replay for efficient value learning, whereas the actor is updated using low-error transitions to stabilize policy optimization. An adaptive anchor mechanism balances uniform and prioritized sampling based on the coefficient of variation of TD errors, and…
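The dual-stream idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the `alpha` exponent, the `scale` squashing constant, the quantile cutoff for "low-error" transitions, and the direction of the anchor (higher coefficient of variation shifts weight toward prioritized sampling) are all assumptions made for illustration, since the abstract does not specify them.

```python
import random
import statistics


def coefficient_of_variation(td_errors, eps=1e-8):
    """Std / mean of absolute TD errors (population std)."""
    mags = [abs(e) for e in td_errors]
    return statistics.pstdev(mags) / (statistics.fmean(mags) + eps)


def anchor_weight(cv, scale=1.0):
    """Hypothetical squashing of CV in [0, inf) to a mixing weight in [0, 1)."""
    return cv / (cv + scale)


def critic_probs(td_errors, alpha=0.6, eps=1e-6):
    """Critic stream: mix prioritized and uniform sampling probabilities.

    Assumption: the mixing weight grows with the CV of the TD errors.
    """
    prios = [(abs(e) + eps) ** alpha for e in td_errors]
    total = sum(prios)
    lam = anchor_weight(coefficient_of_variation(td_errors))
    n = len(td_errors)
    return [lam * p / total + (1.0 - lam) / n for p in prios]


def sample_critic(td_errors, batch_size, rng):
    """Draw buffer indices for the critic update."""
    weights = critic_probs(td_errors)
    return rng.choices(range(len(td_errors)), weights=weights, k=batch_size)


def sample_actor(td_errors, batch_size, rng, quantile=0.5):
    """Actor stream: draw uniformly from low-error transitions only.

    Assumption: "low-error" means |TD error| at or below a quantile cutoff.
    """
    mags = sorted(abs(e) for e in td_errors)
    cutoff = mags[int(quantile * (len(mags) - 1))]
    pool = [i for i, e in enumerate(td_errors) if abs(e) <= cutoff]
    return rng.choices(pool, k=batch_size)


if __name__ == "__main__":
    rng = random.Random(0)
    td = [0.1, 0.2, 0.1, 5.0]  # one shared buffer, one TD error per transition
    print("critic probs:", critic_probs(td))
    print("critic batch:", sample_critic(td, 4, rng))
    print("actor batch:", sample_actor(td, 4, rng))
```

Both streams read the same TD errors from one shared buffer; only the sampling rule differs. Under these assumptions, when all TD errors are equal the coefficient of variation is zero and the critic falls back to uniform sampling.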
