Learning the Target Network in Function Space

Kavosh Asadi; Yao Liu; Shoham Sabach; Ming Yin; Rasool Fakoor

arXiv:2406.01838·cs.LG·September 24, 2024

TL;DR

This paper introduces Lookahead-Replicate (LR), a value-function approximation method for reinforcement learning that keeps the online and target networks equivalent in function space rather than in parameter space. The authors show that LR converges when learning the value function and report significant improvements for deep RL on the Atari benchmark.

Contribution

LR is a new algorithm whose target-network update matches the target network to the online network in function space rather than in parameter space. The paper pairs this update with a convergence result and with empirical gains for deep RL on Atari.

Findings

01. LR converges when learning the value function.

02. LR-based target-network updates significantly improve deep RL performance on the Atari benchmark.

03. Function-space equivalence between the online and target networks benefits deep RL stability.

Abstract

We focus on the task of learning the value function in the reinforcement learning (RL) setting. This task is often solved by updating a pair of online and target networks while ensuring that the parameters of these two networks are equivalent. We propose Lookahead-Replicate (LR), a new value-function approximation algorithm that is agnostic to this parameter-space equivalence. Instead, the LR algorithm is designed to maintain an equivalence between the two networks in the function space. This value-based equivalence is obtained by employing a new target-network update. We show that LR leads to a convergent behavior in learning the value function. We also present empirical results demonstrating that LR-based target-network updates significantly improve deep RL on the Atari benchmark.
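
The abstract does not spell out the update rules, so the following is only a minimal sketch of one plausible reading, not the authors' algorithm. It assumes a small, made-up Markov reward process (`P`, `r`, `gamma`), linear value models with hypothetical feature matrices `Phi` (online) and `Psi` (target), a "lookahead" step that fits the online values to a one-step Bellman backup of the target values, and a "replicate" step that fits the target values to the online values. Both steps are solved exactly by least squares for brevity; a deep RL agent would presumably take a few gradient steps instead.

```python
import numpy as np

def lookahead_replicate(P, r, gamma, Phi, Psi, iterations=300):
    """Policy evaluation with an online model (Phi @ theta) and a target model (Psi @ w).

    Illustrative sketch only: both update rules below are assumptions.
    """
    theta = np.zeros(Phi.shape[1])
    w = np.zeros(Psi.shape[1])
    for _ in range(iterations):
        # Lookahead (assumed form): fit the online values to the one-step
        # Bellman backup computed from the target model's values.
        backup = r + gamma * P @ (Psi @ w)
        theta, *_ = np.linalg.lstsq(Phi, backup, rcond=None)
        # Replicate (assumed form): fit the target model so that its VALUES
        # match the online values. Parameters are never copied.
        w, *_ = np.linalg.lstsq(Psi, Phi @ theta, rcond=None)
    return theta, w

# Hypothetical 3-state Markov reward process under a fixed policy.
P = np.array([[0.5, 0.5, 0.0],
              [0.0, 0.5, 0.5],
              [0.5, 0.0, 0.5]])
r = np.array([1.0, 0.0, 2.0])
gamma = 0.9

# The two models use different (invertible) features, so equal values
# do not imply equal parameters.
Phi = np.eye(3)
Psi = np.array([[2.0, 0.0, 0.0],
                [1.0, 1.0, 0.0],
                [0.0, 0.0, 0.5]])

theta, w = lookahead_replicate(P, r, gamma, Phi, Psi)
v_true = np.linalg.solve(np.eye(3) - gamma * P, r)  # exact value function
```

In this toy setting the two models end up agreeing on every state's value while their parameter vectors differ, which is the sense in which the update is agnostic to parameter-space equivalence. A hard parameter copy would not even be well defined if the two models had different architectures.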

Taxonomy

Topics: Neural Networks and Applications

Methods: Focus