Beyond Single-Model Optimization: Preserving Plasticity in Continual Reinforcement Learning
Lute Lillo, Nick Cheney

TL;DR
This paper introduces TeLAPA, a continual reinforcement learning framework that maintains diverse, skill-aligned policy neighborhoods in a shared latent space to enhance plasticity and transferability across tasks.
Contribution
TeLAPA shifts continual RL from single-model preservation to maintaining behaviorally diverse policy neighborhoods for better reuse and adaptation.
Findings
TeLAPA successfully learns more tasks in MiniGrid continual-learning (CL) settings.
It recovers competence faster after interference.
It retains higher performance across task sequences.
Abstract
Continual reinforcement learning must balance retention with adaptation, yet many methods still rely on single-model preservation, committing to one evolving policy as the main reusable solution across tasks. Even when a previously successful policy is retained, it may no longer provide a reliable starting point for rapid adaptation after interference, reflecting a form of loss of plasticity that single-policy preservation cannot address. Inspired by quality-diversity methods, we introduce TeLAPA (Transfer-Enabled Latent-Aligned Policy Archives), a continual RL framework that organizes behaviorally diverse policy neighborhoods into per-task archives and maintains a shared latent space so that archived policies remain comparable and reusable under non-stationary drift. This perspective shifts continual RL from retaining isolated solutions to maintaining…
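To make the idea of per-task archives of behaviorally diverse policies more concrete, here is a minimal Python sketch in the spirit of generic quality-diversity archives (MAP-Elites style). It is not the authors' implementation: the abstract gives no algorithmic details, so `Entry`, `TaskArchive`, `cell_size`, `insert`, and the grid discretization of the latent space are all hypothetical names and choices made for illustration. Each task keeps the best-scoring policy per latent cell, and a neighborhood query returns the archived policies closest to a given latent point.

```python
import math
from dataclasses import dataclass, field


@dataclass
class Entry:
    policy_id: str   # hypothetical handle to stored policy weights
    latent: tuple    # assumed behavior embedding in the shared latent space
    score: float     # assumed task return achieved by this policy


@dataclass
class TaskArchive:
    cell_size: float = 0.5                     # assumed grid resolution
    cells: dict = field(default_factory=dict)  # latent cell -> best Entry

    def _cell(self, latent):
        # Discretize a latent vector into a grid cell key.
        return tuple(math.floor(x / self.cell_size) for x in latent)

    def add(self, entry):
        # Keep only the highest-scoring policy in each latent cell.
        key = self._cell(entry.latent)
        best = self.cells.get(key)
        if best is None or entry.score > best.score:
            self.cells[key] = entry
            return True
        return False

    def neighborhood(self, latent, k=3):
        # Return the k archived policies nearest to `latent`.
        ranked = sorted(self.cells.values(),
                        key=lambda e: math.dist(e.latent, latent))
        return ranked[:k]


archives = {}  # task name -> TaskArchive, all using the same latent space


def insert(task, entry):
    # Create the task's archive on first use, then try to add the entry.
    return archives.setdefault(task, TaskArchive()).add(entry)
```

Under these assumptions, reuse on a new or revisited task would start from `archives[task].neighborhood(z)` for some latent point `z` rather than from a single preserved policy. How TeLAPA actually aligns the latent space and selects policies is not specified in the text above.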
