Parseval Regularization for Continual Reinforcement Learning

Wesley Chung; Lynn Cherif; David Meger; Doina Precup

arXiv:2412.07224·cs.LG·December 11, 2024

TL;DR

This paper applies Parseval regularization, which keeps weight matrices orthogonal, to continual reinforcement learning, improving training stability and performance on gridworld, CARL, and MetaWorld tasks.

Contribution

The paper proposes Parseval regularization as a way to keep networks trainable across a sequence of tasks in continual RL, and supports it with experiments and ablations that identify the source of its benefits.

Findings

01 Improved RL agent performance on gridworld, CARL, and MetaWorld tasks

02 Enhanced training stability and plasticity retention

03 Insights into network trainability metrics such as weight matrix rank, weight norms, and policy entropy

Abstract

Loss of plasticity, trainability loss, and primacy bias have been identified as issues arising when training deep neural networks on sequences of tasks, all referring to the increased difficulty in training on new tasks. We propose to use Parseval regularization, which maintains orthogonality of weight matrices, to preserve useful optimization properties and improve training in a continual reinforcement learning setting. We show that it provides significant benefits to RL agents on a suite of gridworld, CARL, and MetaWorld tasks. We conduct comprehensive ablations to identify the source of its benefits and investigate the effect of certain metrics associated with network trainability, including weight matrix rank, weight norms, and policy entropy.
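To make the idea of "maintaining orthogonality of weight matrices" concrete, here is a minimal sketch of an orthogonality penalty. It assumes the commonly used form, the squared Frobenius norm of W W^T - s I, which may differ in detail from what the paper uses. The function name `parseval_penalty` and the `scale` parameter are hypothetical names chosen for illustration, and the matrix is a plain list of rows so the example needs no libraries.

```python
def parseval_penalty(weight, scale=1.0):
    """Squared Frobenius norm of (W W^T - scale * I).

    `weight` is a matrix given as a list of rows. The penalty is zero
    exactly when the rows are mutually orthogonal with squared norm `scale`.
    This is an assumed, illustrative form of the regularizer.
    """
    n_rows = len(weight)
    total = 0.0
    for i in range(n_rows):
        for j in range(n_rows):
            # Entry (i, j) of the Gram matrix W W^T: dot product of rows i and j.
            gram_ij = sum(a * b for a, b in zip(weight[i], weight[j]))
            target = scale if i == j else 0.0
            total += (gram_ij - target) ** 2
    return total


orthogonal = [[1.0, 0.0], [0.0, 1.0]]  # rows are orthonormal
collapsed = [[1.0, 0.0], [1.0, 0.0]]   # rank 1: both rows identical

print(parseval_penalty(orthogonal))
print(parseval_penalty(collapsed))
```

In training, a term like this would typically be multiplied by a small coefficient and added to the agent's loss for each weight matrix, so that gradient steps pull the rows back toward orthogonality. The coefficient and which layers to regularize are design choices not specified here.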

Videos

Parseval Regularization for Continual Reinforcement Learning · SlidesLive

Taxonomy

Topics: Machine Learning and ELM