Clustered Reinforcement Learning

Xiao Ma; Shen-Yi Zhao; Wu-Jun Li

arXiv:1906.02457 · cs.LG · June 7, 2019 · 1 citation

TL;DR

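This paper introduces clustered reinforcement learning (CRL), a framework that clusters the collected states and gives the agent a bonus reward reflecting both the novelty and the quality of the cluster containing the current state, with the aim of more efficient exploration in environments with large state spaces or sparse rewards.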

Contribution

The paper proposes a novel clustering-based exploration method, CRL, which effectively guides RL agents by leveraging cluster information to balance exploration and exploitation.

Findings

1. CRL outperforms existing methods on a continuous control task.
2. CRL achieves superior results on several Atari 2600 games.
3. Clustering-based rewards improve exploration efficiency.

Abstract

Exploration strategy design is one of the challenging problems in reinforcement learning (RL), especially when the environment contains a large state space or sparse rewards. During exploration, the agent tries to discover novel areas or high reward (quality) areas. In most existing methods, the novelty and quality in the neighboring area of the current state are not well utilized to guide the exploration of the agent. To tackle this problem, we propose a novel RL framework, called clustered reinforcement learning (CRL), for efficient exploration in RL. CRL adopts clustering to divide the collected states into several clusters, based on which a bonus reward reflecting both novelty and quality in the neighboring area (cluster) of the current state is given to the agent. Experiments on a continuous control task and several Atari 2600 games show…
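
The idea of a cluster-based bonus described in the abstract can be illustrated with a minimal Python sketch. The abstract does not give the bonus formula, so the one used here (an inverse-square-root visit-count term for novelty plus a mean-reward term for quality) is an illustrative assumption and not the paper's method. The function names, the parameters `beta` and `eta`, and the fixed cluster centers (assumed to come from some earlier clustering step such as k-means) are all hypothetical.

```python
import math
from collections import defaultdict

def nearest_cluster(state, centers):
    """Return the index of the center closest to `state` (squared Euclidean distance)."""
    return min(
        range(len(centers)),
        key=lambda i: sum((s - c) ** 2 for s, c in zip(state, centers[i])),
    )

def cluster_statistics(states, rewards, centers):
    """Count visits and sum environment rewards for each cluster."""
    counts = defaultdict(int)
    reward_sums = defaultdict(float)
    for state, reward in zip(states, rewards):
        k = nearest_cluster(state, centers)
        counts[k] += 1
        reward_sums[k] += reward
    return counts, reward_sums

def bonus_reward(state, centers, counts, reward_sums, beta=0.1, eta=1.0):
    """Hypothetical bonus for `state`, NOT the formula from the paper.

    novelty: larger when the state's cluster has few collected states.
    quality: mean environment reward observed in the state's cluster.
    """
    k = nearest_cluster(state, centers)
    n = counts.get(k, 0)
    novelty = 1.0 / math.sqrt(n + 1)
    quality = reward_sums.get(k, 0.0) / n if n > 0 else 0.0
    return beta * (novelty + eta * quality)

# Toy data: two assumed cluster centers and three collected states.
centers = [(0.0, 0.0), (10.0, 10.0)]
states = [(0.1, 0.2), (0.3, -0.1), (9.8, 10.1)]
rewards = [0.0, 0.0, 1.0]

counts, reward_sums = cluster_statistics(states, rewards, centers)
bonus_near_origin = bonus_reward((0.0, 0.0), centers, counts, reward_sums)
bonus_far_corner = bonus_reward((10.0, 10.0), centers, counts, reward_sums)
```

In this toy data the cluster around (10, 10) has fewer collected states and a higher mean reward, so a state falling in it receives the larger bonus. In training, such a bonus would typically be added to the environment reward before the policy update; how CRL weights and combines the two terms is specified in the paper itself, not here.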

Taxonomy

Topics: Reinforcement Learning in Robotics · Advanced Bandit Algorithms Research · Artificial Intelligence in Games