# Adaptive Prior Selection for Repertoire-based Online Adaptation in Robotics

**Authors:** Rituraj Kaushik, Pierre Desreumaux, Jean-Baptiste Mouret

arXiv: 1907.07029 · 2020-03-05

## TL;DR

This paper introduces APROL, an algorithm that selects among multiple policy repertoires, each generated for a different situation, during online robot adaptation, reducing the interaction time needed to adapt to damage or other unforeseen changes.

## Contribution

The paper proposes APROL, a novel algorithm that dynamically chooses the most relevant policy repertoire for online adaptation in robotics, relaxing the single-repertoire assumption.

## Key findings

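- APROL solves both simulated tasks (object pushing with a robotic arm and goal reaching with a damaged hexapod) in less interaction time than RTE and the single-repertoire baselines.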
- APROL enables a damaged hexapod to quickly learn compensatory behaviors.
- The method is successfully demonstrated on a real damaged hexapod robot.

## Abstract

Repertoire-based learning is a data-efficient adaptation approach based on a two-step process in which (1) a large and diverse set of policies is learned in simulation, and (2) a planning or learning algorithm chooses the most appropriate policies according to the current situation (e.g., a damaged robot, a new object, etc.). In this paper, we relax the assumption of previous works that a single repertoire is enough for adaptation. Instead, we generate repertoires for many different situations (e.g., with a missing leg, on different floors, etc.) and let our algorithm select the most useful prior. Our main contribution is an algorithm, APROL (Adaptive Prior selection for Repertoire-based Online Learning), to plan the next action by incorporating these priors when the robot has no information about the current situation. We evaluate APROL on two simulated tasks: (1) pushing unknown objects of various shapes and sizes with a robotic arm and (2) a goal reaching task with a damaged hexapod robot. We compare with "Reset-free Trial and Error" (RTE) and various single repertoire-based baselines. The results show that APROL solves both tasks in less interaction time than the baselines. Additionally, we demonstrate APROL on a real, damaged hexapod that quickly learns to pick compensatory policies to reach a goal by avoiding obstacles in the path.
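
To make the idea of "selecting the most useful prior" concrete, here is a minimal Python sketch. It is not the paper's implementation: the abstract does not state the selection criterion, so this sketch assumes each repertoire predicts an outcome for every policy and scores repertoires by an isotropic Gaussian log-likelihood of the outcomes observed so far. The names `log_likelihood`, `select_prior`, the `sigma` parameter, and the toy repertoires are all hypothetical.

```python
import math


def log_likelihood(predicted, observed, sigma=0.1):
    """Isotropic Gaussian log-likelihood of an observation given a prediction.

    Assumption: observation noise is Gaussian with standard deviation sigma.
    """
    sq_err = sum((p - o) ** 2 for p, o in zip(predicted, observed))
    dim = len(observed)
    return -0.5 * sq_err / sigma**2 - dim * math.log(sigma * math.sqrt(2 * math.pi))


def select_prior(repertoires, history, sigma=0.1):
    """Pick the repertoire whose predictions best explain the history.

    repertoires: dict mapping a name to a list of predicted outcomes,
                 one per policy index (hypothetical data layout).
    history:     list of (policy_index, observed_outcome) pairs.
    Returns the best repertoire name and the per-repertoire scores.
    """
    scores = {
        name: sum(log_likelihood(preds[i], obs, sigma) for i, obs in history)
        for name, preds in repertoires.items()
    }
    return max(scores, key=scores.get), scores


# Toy repertoires: predicted (x, y) displacement for three policies.
repertoires = {
    "intact": [(1.0, 0.0), (0.0, 1.0), (-1.0, 0.0)],
    "missing_leg": [(0.6, 0.3), (0.2, 0.7), (-0.5, -0.2)],
}

# Outcomes observed on the robot after executing policies 0 and 1.
history = [(0, (0.62, 0.28)), (1, (0.18, 0.74))]

best, scores = select_prior(repertoires, history)
print(best)
```

In this toy example the observed displacements sit much closer to the `missing_leg` predictions, so that repertoire receives the higher score. A planner would then use the selected repertoire's predictions when choosing the next policy to execute.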

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/1907.07029/full.md

## Figures

8 figures with captions in the complete paper: https://tomesphere.com/paper/1907.07029/full.md

## References

37 references — full list in the complete paper: https://tomesphere.com/paper/1907.07029/full.md

---
Source: https://tomesphere.com/paper/1907.07029