Stable Continual Reinforcement Learning via Diffusion-based Trajectory Replay
Feng Chen, Fuguang Han, Cong Guan, Lei Yuan, Zhilong Zhang, Yang Yu, Zongzhang Zhang

TL;DR
This paper introduces DISTR, a diffusion-based trajectory replay method for continual reinforcement learning. DISTR mitigates catastrophic forgetting by memorizing high-return trajectories and prioritizing pivotal tasks, and it outperforms existing continual RL baselines on the Continual World benchmark.
Contribution
The paper proposes a diffusion-model-based replay mechanism for continual RL. It addresses the limited expressive capacity of earlier generative replay methods and improves both stability and plasticity when learning multiple tasks in sequence.
Findings
DISTR outperforms existing continual RL baselines on the Continual World benchmark.
The diffusion-based replay effectively preserves task knowledge and improves success rates.
Prioritization of pivotal tasks enhances learning efficiency and stability.
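The two ingredients named above, keeping high-return trajectories and prioritizing pivotal tasks, can be illustrated with a small sketch. This page does not say how DISTR selects trajectories or weights tasks, so everything below is an assumption for illustration: the top-fraction filter, the weight-proportional task sampling, and the names `keep_high_return` and `sample_replay_tasks` are hypothetical and are not taken from the paper. In a full system the kept trajectories would presumably serve as training data for the diffusion model, which is omitted here.

```python
import random
from typing import Dict, List, Tuple

# Hypothetical trajectory format: a list of (state, action, reward) steps,
# with scalars standing in for real observations and actions.
Trajectory = List[Tuple[float, float, float]]


def trajectory_return(traj: Trajectory) -> float:
    """Undiscounted return: the sum of rewards along the trajectory."""
    return sum(reward for _, _, reward in traj)


def keep_high_return(trajs: List[Trajectory], keep_fraction: float = 0.5) -> List[Trajectory]:
    """Keep the top `keep_fraction` of trajectories ranked by return.

    Assumption: a simple top-k filter; the paper may use a different rule.
    """
    if not trajs:
        return []
    k = max(1, int(len(trajs) * keep_fraction))
    return sorted(trajs, key=trajectory_return, reverse=True)[:k]


def sample_replay_tasks(task_weights: Dict[str, float], n: int, seed: int = 0) -> List[str]:
    """Draw `n` past-task names with probability proportional to their weight.

    Assumption: "pivotal" tasks are simply given larger weights by the caller.
    """
    rng = random.Random(seed)
    names = list(task_weights)
    weights = [task_weights[name] for name in names]
    return rng.choices(names, weights=weights, k=n)


if __name__ == "__main__":
    trajs = [[(0.0, 0.0, 1.0)], [(0.0, 0.0, 5.0)], [(0.0, 0.0, 3.0)], [(0.0, 0.0, 0.0)]]
    kept = keep_high_return(trajs, keep_fraction=0.5)
    print([trajectory_return(t) for t in kept])
    print(sample_replay_tasks({"task-a": 3.0, "task-b": 1.0}, n=8))
```

The sketch separates the two decisions on purpose: which data from a task is worth remembering, and which past tasks are replayed more often. Either rule can be swapped without touching the other.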
Abstract
Given the inherent non-stationarity prevalent in real-world applications, continual Reinforcement Learning (RL) aims to equip the agent with the capability to address a series of sequentially presented decision-making tasks. Within this problem setting, a pivotal challenge is the \textit{catastrophic forgetting} issue, wherein the agent easily loses the decision-making knowledge associated with previously encountered tasks when learning a new one. In recent work, \textit{generative replay} methods have shown substantial potential by employing generative models to replay the data distribution of past tasks. Compared to storing the data from past tasks directly, this category of methods circumvents the growing storage overhead and possible data privacy concerns. However, constrained by the expressive capacity of generative models, existing \textit{generative…
Taxonomy
Topics: Traffic control and management
Methods: Diffusion
