Active exploration in parameterized reinforcement learning

Mehdi Khamassi; Costas Tzafestas

arXiv:1610.01986·cs.LG·October 7, 2016

TL;DR

This paper introduces an active exploration algorithm for parameterized reinforcement learning in continuous action spaces, dynamically adjusting exploration parameters to improve performance in non-stationary environments.

Contribution

The paper proposes a meta-learning-based method that automatically tunes exploration parameters for RL in structured continuous action spaces.

Findings

01. Outperforms non-active exploration RL methods in a virtual human-robot interaction task.

02. Demonstrates the effectiveness of adaptive exploration in non-stationary environments.

03. Shows that meta-learning can effectively tune exploration parameters for better performance.

Abstract

Online model-free reinforcement learning (RL) methods with continuous actions are playing a prominent role when dealing with real-world applications such as robotics. However, when confronted with non-stationary environments, these methods crucially rely on an exploration-exploitation trade-off which is rarely dynamically and automatically adjusted to changes in the environment. Here we propose an active exploration algorithm for RL in structured (parameterized) continuous action space. This framework deals with a set of discrete actions, each of which is parameterized with continuous variables. Discrete exploration is controlled through a Boltzmann softmax function with an inverse temperature parameter $\beta$. In parallel, a Gaussian exploration is applied to the continuous action parameters. We apply a meta-learning algorithm based on the comparison between variations of short-term and…
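The two exploration mechanisms named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the per-action list of parameter means, and the single shared standard deviation `sigma` are assumptions made for illustration, and the meta-learning rule that adapts the exploration parameters is left out because the abstract is truncated before describing it.

```python
import math
import random


def softmax_probs(q_values, beta):
    """Boltzmann softmax over discrete action values.

    beta is the inverse temperature: beta = 0 gives uniform random
    choice, large beta approaches greedy selection.
    """
    m = max(q_values)  # subtract the max for numerical stability
    exps = [math.exp(beta * (q - m)) for q in q_values]
    total = sum(exps)
    return [e / total for e in exps]


def select_parameterized_action(q_values, param_means, beta, sigma, rng):
    """Pick a discrete action, then perturb its continuous parameters.

    param_means[a] holds the current parameter estimates of action a
    (hypothetical layout); sigma is an assumed shared Gaussian width.
    """
    probs = softmax_probs(q_values, beta)
    action = rng.choices(range(len(q_values)), weights=probs, k=1)[0]
    params = [rng.gauss(mu, sigma) for mu in param_means[action]]
    return action, params


# Illustrative values only, not taken from the paper.
rng = random.Random(0)
q_values = [0.2, 0.5, 0.1]
param_means = [[0.0], [1.0, -1.0], [0.5]]
action, params = select_parameterized_action(
    q_values, param_means, beta=5.0, sigma=0.1, rng=rng
)
print(action, params)
```

In this sketch, raising `beta` or lowering `sigma` shifts behaviour toward exploitation, while lowering `beta` or raising `sigma` increases exploration; these are the kinds of parameters an active exploration scheme would adjust online.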

Code & Models

Repositories

MehdiKhamassi/SocialMetaLearning
none · Official

Taxonomy

Topics: Reinforcement Learning in Robotics · Advanced Control Systems Optimization · Advanced Bandit Algorithms Research

Methods: Gaussian Process · Q-Learning · Softmax