Curiosity-Driven Experience Prioritization via Density Estimation
Rui Zhao, Volker Tresp

TL;DR
This paper introduces a curiosity-driven prioritization method for reinforcement learning that over-samples trajectories whose achieved goal states are rare, improving learning efficiency and performance in robotic tasks.
Contribution
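The paper proposes a novel Curiosity-Driven Prioritization (CDP) framework that over-samples trajectories with rare achieved goal states. The framework mimics the human tendency to focus on uncommon events and addresses the imbalance of achieved goal states in the replay memory.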
Findings
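CDP improves RL performance on the robot manipulation tasks of the OpenAI Gym robotic environment.
CDP increases the sample efficiency of RL agents.
Combining CDP with Deep Deterministic Policy Gradient (DDPG) and Hindsight Experience Replay (HER) yields superior results.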
Abstract
In Reinforcement Learning (RL), an agent explores the environment and collects trajectories into the memory buffer for later learning. However, the collected trajectories can easily be imbalanced with respect to the achieved goal states. The problem of learning from imbalanced data is a well-known problem in supervised learning, but has not yet been thoroughly researched in RL. To address this problem, we propose a novel Curiosity-Driven Prioritization (CDP) framework to encourage the agent to over-sample those trajectories that have rare achieved goal states. The CDP framework mimics the human learning process and focuses more on relatively uncommon events. We evaluate our methods using the robotic environment provided by OpenAI Gym. The environment contains six robot manipulation tasks. In our experiments, we combined CDP with Deep Deterministic Policy Gradient (DDPG) with or without…
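The over-sampling idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the Gaussian kernel density estimate, the rank-based weighting, the `bandwidth` value, and the function names `kde_density` and `rarity_sampling_probs` are all assumptions chosen for illustration. Each trajectory is represented by a hypothetical feature vector of its achieved goal states; trajectories in low-density regions receive a higher replay probability.

```python
import numpy as np


def kde_density(points, bandwidth=0.5):
    """Unnormalized Gaussian kernel density of each point under the whole set.

    The normalizing constant is omitted because only the ordering of the
    densities is used below. (Illustrative estimator, an assumption.)
    """
    diffs = points[:, None, :] - points[None, :, :]
    sq_dists = np.sum(diffs ** 2, axis=-1)
    kernel = np.exp(-sq_dists / (2.0 * bandwidth ** 2))
    return kernel.mean(axis=1)


def rarity_sampling_probs(goal_features, bandwidth=0.5):
    """Map densities to replay probabilities: rarer goals get more weight.

    Rank 1 goes to the densest trajectory and rank N to the rarest one;
    probabilities are proportional to rank. (Hypothetical weighting scheme.)
    """
    density = kde_density(goal_features, bandwidth)
    order = np.argsort(-density)  # indices from densest to rarest
    ranks = np.empty(len(density), dtype=np.float64)
    ranks[order] = np.arange(1, len(density) + 1)
    return ranks / ranks.sum()


# Toy replay buffer: 95 trajectories with common achieved goals, 5 with rare ones.
rng = np.random.default_rng(0)
common = rng.normal(0.0, 0.1, size=(95, 3))
rare = rng.normal(3.0, 0.1, size=(5, 3))
goals = np.vstack([common, rare])

probs = rarity_sampling_probs(goals)

# Draw a replay minibatch of trajectory indices according to the priorities.
batch = rng.choice(len(goals), size=32, p=probs)
```

In this toy buffer the five rare trajectories end up with the highest sampling probabilities, so they appear in minibatches more often than uniform sampling would allow. An off-policy learner such as DDPG could then train on the sampled trajectories as usual.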
Taxonomy
Topics: Visual Attention and Saliency Detection · Psychological and Educational Research Studies · Mobile Crowdsensing and Crowdsourcing
