Faster Deep Reinforcement Learning with Slower Online Network

Kavosh Asadi; Rasool Fakoor; Omer Gottesman; Taesup Kim; Michael L. Littman; Alexander J. Smola

arXiv:2112.05848·cs.LG·April 19, 2023·1 cites

TL;DR

This paper introduces a simple modification to deep reinforcement learning algorithms that encourages the online network to stay close to the target network, resulting in improved robustness and performance on Atari benchmarks.

Contribution

The paper proposes a novel update mechanism that keeps the online network near the target network, enhancing stability and performance in deep RL algorithms.

Findings

01 DQN Pro outperforms DQN on the Atari benchmark.

02 Rainbow Pro outperforms Rainbow on the Atari benchmark.

03 Keeping the online network close to the target network improves robustness to noisy updates.

Abstract

Deep reinforcement learning algorithms often use two networks for value function optimization: an online network, and a target network that tracks the online network with some delay. Using two separate networks enables the agent to hedge against issues that arise when performing bootstrapping. In this paper we endow two popular deep reinforcement learning algorithms, namely DQN and Rainbow, with updates that incentivize the online network to remain in the proximity of the target network. This improves the robustness of deep reinforcement learning in the presence of noisy updates. The resultant agents, called DQN Pro and Rainbow Pro, exhibit significant performance improvements over their original counterparts on the Atari benchmark, demonstrating the effectiveness of this simple idea in deep reinforcement learning. The code for our paper is available here:…
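
The abstract does not spell out the form of the incentive, so the following is only a minimal sketch under stated assumptions: the pull toward the target network is modeled as a squared-distance penalty added to the usual temporal-difference loss, and a small Q-table stands in for the network weights. The names `proximal_td_step`, `sync_target`, and `prox_coef` are hypothetical and are not taken from the paper or its repository.

```python
def sync_target(online):
    """Copy the online table into a fresh target table (the delayed copy)."""
    return [row[:] for row in online]


def proximal_td_step(online, target, transition,
                     lr=0.1, gamma=0.99, prox_coef=0.0):
    """Return a new online table after one gradient step on the assumed loss

        0.5 * td_error**2 + (prox_coef / 2) * ||online - target||**2

    prox_coef is a hypothetical penalty weight; prox_coef=0 gives the
    ordinary Q-learning step with a target table.
    """
    s, a, r, s_next, done = transition
    # Bootstrapped return estimate comes from the delayed target table.
    bootstrap = 0.0 if done else gamma * max(target[s_next])
    td_error = online[s][a] - (r + bootstrap)
    # Penalty gradient: every entry is pulled toward its target value.
    new_online = [
        [q - lr * prox_coef * (q - t) for q, t in zip(q_row, t_row)]
        for q_row, t_row in zip(online, target)
    ]
    # TD gradient only touches the visited state-action entry.
    new_online[s][a] -= lr * td_error
    return new_online


def squared_distance(online, target):
    """Squared Euclidean distance between the two tables."""
    return sum((q - t) ** 2
               for q_row, t_row in zip(online, target)
               for q, t in zip(q_row, t_row))
```

With `prox_coef=0` only the visited entry changes; with a positive value every entry also moves toward the target table, so the online parameters drift more slowly between calls to `sync_target`. In a deep agent the same penalty would presumably apply to network weights rather than table entries.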

Code & Models

Repositories

amazon-research/fast-rl-with-slow-updates
JAX · Official

Videos

Faster Deep Reinforcement Learning with Slower Online Network · SlidesLive

Taxonomy

Topics: Reinforcement Learning in Robotics · Smart Grid Energy Management · Smart Parking Systems Research

Methods: Dense Connections · Q-Learning · Convolution · Deep Q-Network