Noisy Networks for Exploration
Meire Fortunato, Mohammad Gheshlaghi Azar, Bilal Piot, Jacob Menick, Ian Osband, Alex Graves, Vlad Mnih, Remi Munos, Demis Hassabis, Olivier Pietquin, Charles Blundell, Shane Legg

TL;DR
NoisyNet introduces learnable parametric noise into deep reinforcement learning agents, enhancing exploration efficiency and significantly improving performance across various Atari games compared to traditional exploration methods.
Contribution
The paper presents NoisyNet, a simple yet effective method of incorporating learnable noise into network weights to improve exploration in deep reinforcement learning.
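To make "learnable noise in the network weights" concrete, a noisy linear layer can be written as below. The symbols are notation introduced here for illustration: \mu and \sigma are the learnable parameters, \varepsilon is zero-mean noise that is resampled rather than learned, and \odot is elementwise multiplication.

```latex
\[
  y = \left(\mu^{w} + \sigma^{w} \odot \varepsilon^{w}\right) x
      + \mu^{b} + \sigma^{b} \odot \varepsilon^{b}
\]
```

Setting \sigma^{w} = 0 and \sigma^{b} = 0 recovers an ordinary linear layer y = \mu^{w} x + \mu^{b}, so the network can, in principle, learn to reduce its own noise where exploration is no longer useful.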
Findings
NoisyNet outperforms traditional exploration heuristics in Atari games.
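Replacing the conventional exploration heuristics (entropy reward for A3C, ε-greedy for DQN and dueling agents) with NoisyNet yields substantially higher scores.
In some games, NoisyNet advances the agent from sub-human to super-human performance.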
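The contrast between ε-greedy and noise-driven action selection can be sketched roughly as follows. This is a deliberately simplified illustration, not the paper's implementation: the function names are hypothetical, and the noise is applied directly to the value estimates, whereas NoisyNet perturbs the network weights that produce them.

```python
import random


def argmax(values):
    """Index of the largest value (first one on ties)."""
    return max(range(len(values)), key=values.__getitem__)


def epsilon_greedy(q_values, epsilon, rng):
    """Conventional heuristic: take a uniformly random action with probability epsilon."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_values))
    return argmax(q_values)


def noisy_greedy(mu_q, sigma_q, rng):
    """Noise-driven selection: perturb the estimates, then act greedily.

    ASSUMPTION: mu_q and sigma_q stand in for a network with noisy weights.
    Perturbing the outputs directly keeps the sketch small.
    """
    noisy_q = [m + s * rng.gauss(0.0, 1.0) for m, s in zip(mu_q, sigma_q)]
    return argmax(noisy_q)


rng = random.Random(0)
mu_q = [1.0, 0.9, 0.0]

# With zero noise scale the choice is purely greedy.
greedy_action = noisy_greedy(mu_q, [0.0, 0.0, 0.0], rng)

# With a non-zero noise scale, other actions get tried as well.
explored = {noisy_greedy(mu_q, [1.0, 1.0, 1.0], rng) for _ in range(200)}
```

In this sketch there is no separate exploration rate to tune by hand: how much the agent explores depends on the noise scale, which in NoisyNet is a learned quantity.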
Abstract
We introduce NoisyNet, a deep reinforcement learning agent with parametric noise added to its weights, and show that the induced stochasticity of the agent's policy can be used to aid efficient exploration. The parameters of the noise are learned with gradient descent along with the remaining network weights. NoisyNet is straightforward to implement and adds little computational overhead. We find that replacing the conventional exploration heuristics for A3C, DQN and dueling agents (entropy reward and ε-greedy, respectively) with NoisyNet yields substantially higher scores for a wide range of Atari games, in some cases advancing the agent from sub to super-human performance.
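A minimal sketch of a linear layer with parametric noise on its weights, written in plain Python so it runs without dependencies. The class name `NoisyLinear`, the method names, and the initial values are assumptions chosen for illustration. The sketch uses independent Gaussian noise per weight and shows why both the means and the noise scales receive gradients.

```python
import random


class NoisyLinear:
    """Linear layer with learnable Gaussian noise on weights and biases.

    Effective parameters: w = mu_w + sigma_w * eps_w and b = mu_b + sigma_b * eps_b.
    mu_* and sigma_* are the learnable parameters; eps_* is resampled noise.
    """

    def __init__(self, n_in, n_out, sigma_init=0.017, seed=0):
        # ASSUMPTION: sigma_init and the uniform range are illustrative values.
        self.rng = random.Random(seed)
        bound = (3.0 / n_in) ** 0.5
        self.mu_w = [[self.rng.uniform(-bound, bound) for _ in range(n_in)]
                     for _ in range(n_out)]
        self.mu_b = [self.rng.uniform(-bound, bound) for _ in range(n_out)]
        self.sigma_w = [[sigma_init] * n_in for _ in range(n_out)]
        self.sigma_b = [sigma_init] * n_out
        self.resample_noise()

    def resample_noise(self):
        """Draw fresh zero-mean, unit-variance noise for every weight and bias."""
        self.eps_w = [[self.rng.gauss(0.0, 1.0) for _ in row] for row in self.mu_w]
        self.eps_b = [self.rng.gauss(0.0, 1.0) for _ in self.mu_b]

    def forward(self, x, noisy=True):
        """Return y = w x + b; with noisy=False only the means mu are used."""
        scale = 1.0 if noisy else 0.0
        y = []
        for i in range(len(self.mu_b)):
            total = self.mu_b[i] + scale * self.sigma_b[i] * self.eps_b[i]
            for j, x_j in enumerate(x):
                w_ij = self.mu_w[i][j] + scale * self.sigma_w[i][j] * self.eps_w[i][j]
                total += w_ij * x_j
            y.append(total)
        return y

    def weight_grads(self, x, dy):
        """Gradients of a scalar loss w.r.t. mu_w and sigma_w, given dy = dLoss/dy.

        Since y_i = sum_j (mu_ij + sigma_ij * eps_ij) * x_j + b_i:
        dLoss/dmu_ij = dy_i * x_j and dLoss/dsigma_ij = dy_i * eps_ij * x_j.
        """
        g_mu = [[dy_i * x_j for x_j in x] for dy_i in dy]
        g_sigma = [[dy_i * self.eps_w[i][j] * x_j for j, x_j in enumerate(x)]
                   for i, dy_i in enumerate(dy)]
        return g_mu, g_sigma


layer = NoisyLinear(3, 2, seed=1)
x = [1.0, 2.0, 3.0]
y_mean = layer.forward(x, noisy=False)   # noise switched off
y_first = layer.forward(x)               # one noise sample
layer.resample_noise()
y_second = layer.forward(x)              # a different noise sample
```

Because the noise scales have their own gradients, ordinary gradient descent can update them alongside the means, which is what the abstract means by learning the noise parameters together with the remaining weights.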
Taxonomy
Topics: Reinforcement Learning in Robotics · Adversarial Robustness in Machine Learning · Advanced Bandit Algorithms Research
Methods: Double Q-learning · Softmax · Entropy Regularization · A3C · Dueling Network · Q-Learning · NoisyNet-A3C · NoisyNet-Dueling · NoisyNet-DQN · Noisy Linear Layer
