Mitigating Plasticity Loss in Continual Reinforcement Learning by Reducing Churn
Hongyao Tang, Johan Obando-Ceron, Pablo Samuel Castro, Aaron Courville, Glen Berseth

TL;DR
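This paper studies plasticity loss in deep continual reinforcement learning through the lens of churn, the variability in network outputs on out-of-batch data caused by mini-batch updates. It shows that reducing churn mitigates plasticity loss and introduces C-CHAIN, a method that improves learning performance across several benchmarks.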
Contribution
The paper introduces Continual Churn Approximated Reduction (C-CHAIN), a method that reduces churn in continual RL, which helps prevent rank collapse of the Neural Tangent Kernel (NTK) matrix and improves learning performance.
Findings
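Plasticity loss is accompanied by growing churn, driven by a gradual decrease in the rank of the NTK matrix.
Reducing churn helps prevent NTK rank collapse and adaptively adjusts the step size of the regular RL gradients.
C-CHAIN outperforms baselines in continual learning environments on OpenAI Gym Control, ProcGen, DeepMind Control Suite, and MinAtar.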
Abstract
Plasticity, or the ability of an agent to adapt to new tasks, environments, or distributions, is crucial for continual learning. In this paper, we study the loss of plasticity in deep continual RL from the lens of churn: network output variability for out-of-batch data induced by mini-batch training. We demonstrate that (1) the loss of plasticity is accompanied by the exacerbation of churn due to the gradual rank decrease of the Neural Tangent Kernel (NTK) matrix; (2) reducing churn helps prevent rank collapse and adjusts the step size of regular RL gradients adaptively. Moreover, we introduce Continual Churn Approximated Reduction (C-CHAIN) and demonstrate it improves learning performance and outperforms baselines in a diverse range of continual learning environments on OpenAI Gym Control, ProcGen, DeepMind Control Suite, and MinAtar benchmarks.
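To make the two quantities in the abstract concrete, here is a minimal NumPy sketch of how churn and NTK rank could be measured on a toy network. It is not the paper's implementation or the C-CHAIN algorithm. The network size, the squared-error loss, the learning rate, and every function name (`init_params`, `sgd_step`, `churn`, `ntk_rank`, `first_order_change`) are assumptions chosen for illustration. Churn is taken here as the mean absolute change in outputs on a reference (out-of-batch) set after one mini-batch update, and the empirical NTK as the Gram matrix of per-example output gradients.

```python
import numpy as np

def init_params(rng, d_in=4, d_hidden=16):
    # Hypothetical toy network: one tanh hidden layer, scalar output.
    return {
        "W1": rng.normal(0.0, 1.0 / np.sqrt(d_in), (d_in, d_hidden)),
        "w2": rng.normal(0.0, 1.0 / np.sqrt(d_hidden), (d_hidden,)),
    }

def forward(params, x):
    return np.tanh(x @ params["W1"]) @ params["w2"]

def jacobian(params, x):
    # Per-example gradient of the scalar output w.r.t. all parameters,
    # flattened as [W1 (row-major), w2]; shape (n, P).
    h = np.tanh(x @ params["W1"])            # (n, H)
    dpre = (1.0 - h ** 2) * params["w2"]     # d out / d pre-activation, (n, H)
    dW1 = x[:, :, None] * dpre[:, None, :]   # (n, d_in, H)
    return np.concatenate([dW1.reshape(len(x), -1), h], axis=1)

def sgd_step(params, x, y, lr):
    # One gradient step on 0.5 * mean squared error (an assumed loss).
    residual = forward(params, x) - y
    grad = jacobian(params, x).T @ residual / len(x)
    n_w1 = params["W1"].size
    return {
        "W1": params["W1"] - lr * grad[:n_w1].reshape(params["W1"].shape),
        "w2": params["w2"] - lr * grad[n_w1:],
    }

def churn(params_before, params_after, x_ref):
    # Mean absolute output change on data that was NOT in the training batch.
    delta = forward(params_after, x_ref) - forward(params_before, x_ref)
    return float(np.mean(np.abs(delta)))

def ntk_rank(params, x):
    # Empirical NTK on x is the Gram matrix of per-example gradients.
    J = jacobian(params, x)
    return int(np.linalg.matrix_rank(J @ J.T))

def first_order_change(params, x_ref, x_batch, y_batch, lr):
    # First-order Taylor estimate of the output change on x_ref:
    # -lr / n * K(x_ref, x_batch) @ residual.
    K_cross = jacobian(params, x_ref) @ jacobian(params, x_batch).T
    residual = forward(params, x_batch) - y_batch
    return -lr * K_cross @ residual / len(x_batch)

rng = np.random.default_rng(0)
params = init_params(rng)
x_batch, y_batch = rng.normal(size=(8, 4)), rng.normal(size=8)
x_ref = rng.normal(size=(32, 4))             # out-of-batch reference inputs

new_params = sgd_step(params, x_batch, y_batch, lr=0.05)
print("churn on reference set:", churn(params, new_params, x_ref))
print("empirical NTK rank:", ntk_rank(params, x_ref))
```

The cross-kernel in `first_order_change` shows why the two quantities are linked: to first order, how much an update on one batch moves outputs elsewhere is governed by the NTK between the reference and training inputs. Tracking `churn` and `ntk_rank` over the course of training is one way such a diagnostic could be logged.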
