Bridging the Gap Between Target Networks and Functional Regularization

Alexandre Piche; Valentin Thomas; Joseph Marino; Rafael Pardinas; Gian Maria Marconi; Christopher Pal; Mohammad Emtiyaz Khan

arXiv:2210.12282 · cs.LG · January 4, 2024

TL;DR

This paper reveals that Target Networks in Deep Reinforcement Learning act as implicit regularizers and proposes an explicit, convex Functional Regularization to improve training stability, efficiency, and performance.

Contribution

The paper introduces an explicit Functional Regularization method as a more flexible and theoretically grounded alternative to Target Networks.

Findings

01. Functional Regularization improves sample efficiency.

02. Replacing Target Networks with Functional Regularization improves performance.

03. Theoretical analysis characterizes the convergence of the method.

Abstract

Bootstrapping is behind much of the success of Deep Reinforcement Learning. However, learning the value function via bootstrapping often leads to unstable training due to fast-changing target values. Target Networks are employed to stabilize training by using an additional set of lagging parameters to estimate the target values. Despite the popularity of Target Networks, their effect on the optimization is still misunderstood. In this work, we show that they act as an implicit regularizer. This regularizer has disadvantages, such as being inflexible and non-convex. To overcome these issues, we propose an explicit Functional Regularization that is a convex regularizer in function space and can easily be tuned. We analyze the convergence of our method theoretically and empirically demonstrate that replacing Target Networks with the more theoretically grounded Functional Regularization…
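To make the contrast in the abstract concrete, here is a minimal sketch of the two losses on a single transition, assuming a linear Q-function. The function names, the quadratic form of the penalty, and the weight `kappa` are illustrative assumptions and are not taken from this page; the paper itself should be consulted for the exact objective.

```python
# Illustrative sketch only: a linear Q-function on one transition.
# All names (q_value, kappa, w_lag, ...) are hypothetical, and the exact
# form of the regularizer is an assumption, not the authors' stated loss.

def q_value(weights, features):
    """Linear Q-value: dot product of weights and state-action features."""
    return sum(w * x for w, x in zip(weights, features))


def target_network_loss(w, w_lag, feat, next_feat, reward, gamma):
    """Squared TD error where the bootstrap target uses the lagging weights."""
    target = reward + gamma * q_value(w_lag, next_feat)
    return 0.5 * (q_value(w, feat) - target) ** 2


def functional_regularization_loss(w, w_lag, feat, next_feat, reward, gamma, kappa):
    """Squared TD error with an up-to-date bootstrap target, plus an explicit
    quadratic penalty that keeps Q close to the lagging Q at the same input."""
    # The target uses the current weights; when taking gradients it would be
    # treated as a constant (assumption: stop-gradient on the target).
    target = reward + gamma * q_value(w, next_feat)
    td_term = 0.5 * (q_value(w, feat) - target) ** 2
    # The penalty is quadratic in the Q-value, hence convex in function space.
    reg_term = 0.5 * kappa * (q_value(w, feat) - q_value(w_lag, feat)) ** 2
    return td_term + reg_term


if __name__ == "__main__":
    w, w_lag = [1.0, 2.0], [0.5, 1.0]
    feat, next_feat = [1.0, 0.0], [0.0, 1.0]
    print(target_network_loss(w, w_lag, feat, next_feat, reward=1.0, gamma=0.5))
    print(functional_regularization_loss(w, w_lag, feat, next_feat,
                                         reward=1.0, gamma=0.5, kappa=2.0))
```

In this sketch, `kappa` plays the role of the tunable regularization strength: setting it to zero recovers plain bootstrapping with no lagging parameters, while larger values hold the current Q-values closer to the lagging ones.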

Taxonomy

Topics: Reinforcement Learning in Robotics · Domain Adaptation and Few-Shot Learning · Machine Learning and ELM

Methods: Network On Network