Rewarding Episodic Visitation Discrepancy for Exploration in Reinforcement Learning

Mingqi Yuan; Bo Li; Xin Jin; Wenjun Zeng

arXiv:2209.08842·cs.LG·November 11, 2022·5 cites

TL;DR

This paper introduces REVD, a computationally efficient intrinsic reward method based on visitation discrepancy, which enhances exploration and sample efficiency in reinforcement learning without complex representations.

Contribution

The paper proposes REVD, a novel visitation discrepancy-based intrinsic reward method that is simple, efficient, and effective for exploration in reinforcement learning.

Findings

01 REVD significantly improves sample efficiency in Atari and robotics environments.

02 REVD outperforms existing exploration methods in benchmark tests.

03 REVD has lower computational complexity than prior approaches.

Abstract

Exploration is critical for deep reinforcement learning in complex environments with high-dimensional observations and sparse rewards. To address this problem, recent approaches leverage intrinsic rewards to improve exploration, for example novelty-based and prediction-based exploration. However, many intrinsic reward modules require sophisticated structures and representation learning, resulting in prohibitive computational complexity and unstable performance. In this paper, we propose Rewarding Episodic Visitation Discrepancy (REVD), a computation-efficient and quantified exploration method. More specifically, REVD provides intrinsic rewards by evaluating the Rényi divergence-based visitation discrepancy between episodes. To estimate the divergence efficiently, a k-nearest neighbor estimator is used with a randomly initialized state encoder. Finally, the…
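To make the idea in the abstract concrete, the following is a minimal sketch of a k-nearest-neighbor intrinsic reward computed between two consecutive episodes with a fixed random encoder. It is not the paper's implementation: the function names (`random_encoder`, `kth_nn_distance`, `episodic_discrepancy_rewards`), the linear form of the encoder, and the reward form (a ratio of k-NN distances raised to `1 - alpha`) are all assumptions chosen for illustration, since the truncated abstract does not state the exact formula.

```python
import numpy as np


def random_encoder(obs_dim, latent_dim, seed=0):
    """Hypothetical fixed, randomly initialized linear encoder (never trained)."""
    rng = np.random.default_rng(seed)
    weights = rng.normal(size=(obs_dim, latent_dim)) / np.sqrt(obs_dim)
    return lambda obs: np.asarray(obs, dtype=float) @ weights


def kth_nn_distance(queries, references, k, exclude_self=False):
    """Euclidean distance from each query to its k-th nearest reference point."""
    diffs = queries[:, None, :] - references[None, :, :]
    dists = np.linalg.norm(diffs, axis=-1)
    dists.sort(axis=1)
    # If queries and references are the same set, column 0 holds the
    # zero distance of each point to itself, so shift the index by one.
    idx = k if exclude_self else k - 1
    return dists[:, idx]


def episodic_discrepancy_rewards(curr_obs, prev_obs, encode, k=3, alpha=0.5, eps=1e-8):
    """Assumed reward form: large when a state is far from the previous
    episode's states relative to how densely the current episode covers it."""
    curr = encode(curr_obs)
    prev = encode(prev_obs)
    # k-NN distance inside the current episode (local density proxy).
    within = kth_nn_distance(curr, curr, k, exclude_self=True)
    # k-NN distance from current-episode states to the previous episode.
    across = kth_nn_distance(curr, prev, k)
    return ((across + eps) / (within + eps)) ** (1.0 - alpha)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    previous_episode = rng.normal(size=(50, 8))
    current_episode = rng.normal(size=(50, 8)) + 2.0  # visits a shifted region
    encode = random_encoder(obs_dim=8, latent_dim=4, seed=1)
    rewards = episodic_discrepancy_rewards(current_episode, previous_episode, encode)
    print(rewards.shape, float(rewards.mean()))
```

Under these assumptions, states that lie far from everything visited in the previous episode receive larger rewards, and no encoder parameters are trained, which is in line with the computational-efficiency motivation described above. The pairwise distance matrix costs O(N·M) memory for episodes of length N and M, so a real implementation would likely batch or approximate the neighbor search.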

Taxonomy

Topics: Reinforcement Learning in Robotics · Artificial Intelligence in Games · Explainable Artificial Intelligence (XAI)