Exploration by Random Distribution Distillation
Zhirui Fang, Kai Yang, Jian Tao, Jiafei Lyu, Lusong Li, Li Shen, Xiu Li

TL;DR
This paper introduces Random Distribution Distillation (RDD), a novel exploration method in reinforcement learning that combines count-based and prediction-error approaches through stochastic target network outputs, enhancing exploration efficiency.
Contribution
RDD is a new exploration technique that models target network outputs as samples from a normal distribution, unifying count-based and prediction-error methods in RL.
Findings
RDD improves exploration in high-dimensional spaces.
Experimental results show RDD outperforms existing methods.
Theoretical analysis shows that RDD's intrinsic reward is bounded by components that include a pseudo-count term.
Abstract
Exploration remains a critical challenge in online reinforcement learning, as an agent must effectively explore unknown environments to achieve high returns. Current exploration algorithms are primarily count-based or curiosity-based, with prediction-error methods being a prominent example of the latter. In this paper, we propose a novel method called Random Distribution Distillation (RDD), which samples the output of a target network from a normal distribution. RDD facilitates more extensive exploration by explicitly treating the difference between the prediction network and the target network as an intrinsic reward. Furthermore, by introducing randomness into the output of the target network for a given state and modeling it as a sample from a normal distribution, intrinsic rewards are bounded by two key components: a pseudo-count term…
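The mechanism described in the abstract can be illustrated with a small NumPy sketch. It is not the authors' implementation: the abstract is truncated, so the linear predictor, the tanh target mean, the fixed standard deviation `SIGMA`, the mean-squared-error form of the reward, and all names below are assumptions made for illustration only.

```python
import numpy as np

rng = np.random.default_rng(0)

STATE_DIM, OUT_DIM = 4, 16
SIGMA = 0.1  # assumed fixed std of the target distribution (hypothetical value)

# Fixed, randomly initialised target network. Here it only defines the
# mean of a normal distribution for each state (an assumption).
W_target = rng.normal(size=(STATE_DIM, OUT_DIM))


def sample_target(state):
    """Draw a stochastic target output y ~ N(mu(state), SIGMA^2 * I)."""
    mu = np.tanh(state @ W_target)
    return mu + SIGMA * rng.normal(size=OUT_DIM)


class Predictor:
    """Linear prediction network trained to match the sampled targets."""

    def __init__(self, lr=0.05):
        self.W = np.zeros((STATE_DIM, OUT_DIM))
        self.lr = lr

    def __call__(self, state):
        return state @ self.W

    def update(self, state, target):
        err = self(state) - target
        # Gradient step on 0.5 * ||prediction - target||^2.
        self.W -= self.lr * np.outer(state, err)


def intrinsic_reward(predictor, state, train=True):
    """Mean squared difference between prediction and a sampled target."""
    target = sample_target(state)
    reward = float(np.mean((predictor(state) - target) ** 2))
    if train:
        predictor.update(state, target)
    return reward


# Usage: visit one state many times, then compare it with an unvisited one.
familiar = np.eye(STATE_DIM)[0]
novel = np.eye(STATE_DIM)[1]
predictor = Predictor()
for _ in range(500):
    intrinsic_reward(predictor, familiar)

r_familiar = np.mean([intrinsic_reward(predictor, familiar, train=False) for _ in range(50)])
r_novel = np.mean([intrinsic_reward(predictor, novel, train=False) for _ in range(50)])
print(f"familiar: {r_familiar:.4f}  novel: {r_novel:.4f}")
```

In this sketch the reward for a frequently visited state does not decay to zero: because the target is resampled on every visit, the error settles near the noise variance `SIGMA**2`, while an unvisited state keeps a larger error.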
Taxonomy
Topics: Reinforcement Learning in Robotics · Advanced Bandit Algorithms Research · Stochastic Gradient Optimization Techniques
