# Learning to Explore in Motion and Interaction Tasks

**Authors:** Miroslav Bogdanovic, Ludovic Righetti

arXiv: 1908.03731 · 2019-08-13

## TL;DR

This paper introduces an exploration method that builds a generative model from previously solved tasks to make learning more efficient in robotic motion and interaction tasks, especially when rewards are sparse.

## Contribution

It presents a novel exploration strategy that uses data from previous tasks to accelerate learning and adapt to new tasks in robotic manipulation and locomotion.
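As a rough illustration of the idea, the sketch below fits a simple model to actions recorded on earlier tasks and samples exploration actions from it for a new task. This is not the authors' model: the per-dimension Gaussian is a deliberately simple stand-in for the paper's generative model, and the names `GaussianExplorationModel`, `exploration_action`, `mix_prob`, and `noise_std` are hypothetical, chosen only for this example.

```python
import random
import statistics
from typing import List, Sequence

Action = List[float]


class GaussianExplorationModel:
    """Per-dimension Gaussian over past actions (illustrative stand-in only)."""

    def __init__(self) -> None:
        self.means: List[float] = []
        self.stds: List[float] = []

    def fit(self, actions: Sequence[Sequence[float]]) -> None:
        # Transpose the list of actions into one column per action dimension.
        columns = list(zip(*actions))
        self.means = [statistics.fmean(c) for c in columns]
        self.stds = [statistics.pstdev(c) for c in columns]

    def sample(self, rng: random.Random) -> Action:
        # Draw each dimension independently from its fitted Gaussian.
        return [rng.gauss(m, s) for m, s in zip(self.means, self.stds)]


def exploration_action(
    policy_action: Sequence[float],
    model: GaussianExplorationModel,
    rng: random.Random,
    mix_prob: float = 0.5,   # assumed probability of using the learned model
    noise_std: float = 0.1,  # assumed scale of the fallback random noise
) -> Action:
    """Sample from the learned model with probability mix_prob;
    otherwise perturb the current policy's action with Gaussian noise."""
    if rng.random() < mix_prob:
        return model.sample(rng)
    return [a + rng.gauss(0.0, noise_std) for a in policy_action]


if __name__ == "__main__":
    rng = random.Random(0)
    # Hypothetical actions collected while solving earlier tasks.
    past_actions = [[0.1, -0.2], [0.3, 0.0], [0.2, -0.1]]
    model = GaussianExplorationModel()
    model.fit(past_actions)
    print(exploration_action([0.0, 0.0], model, rng))
```

Mixing model samples with plain noise keeps some undirected exploration in case the new task differs from the earlier ones. Refitting the model as more tasks are solved is one simple way to mirror the continual improvement the paper describes.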

## Key findings

- Experiments suggest the exploration strategy can more than double learning speed in simulation
- Robust to task variations and parameter tuning
- Gains are largest when rewards are sparse

## Abstract

Model-free reinforcement learning suffers from the high sampling complexity inherent to robotic manipulation or locomotion tasks. Most successful approaches typically use random sampling strategies, which lead to slow policy convergence. In this paper we present a novel approach for efficient exploration that leverages previously learned tasks. We exploit the fact that the same system is used across many tasks and build a generative model for exploration based on data from previously solved tasks to improve learning new tasks. The approach also enables continuous learning of improved exploration strategies as novel tasks are learned. Extensive simulations on a robot manipulator performing a variety of motion and contact interaction tasks demonstrate the capabilities of the approach. In particular, our experiments suggest that the exploration strategy can more than double learning speed, especially when rewards are sparse. Moreover, the algorithm is robust to task variations and parameter tuning, making it beneficial for complex robotic problems.

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/1908.03731/full.md

## Figures

15 figures with captions in the complete paper: https://tomesphere.com/paper/1908.03731/full.md

## References

21 references — full list in the complete paper: https://tomesphere.com/paper/1908.03731/full.md

---
Source: https://tomesphere.com/paper/1908.03731