# An Active Learning Framework for Efficient Robust Policy Search

**Authors:** Sai Kiran Narayanaswami, Nandan Sudarsanam, Balaraman Ravindran

arXiv: 1901.00117 · 2021-11-23

## TL;DR

This paper introduces EffAcTS, an active learning framework that improves sample efficiency in robust policy search by selectively sampling environment parameters, validated on continuous control tasks.

## Contribution

The paper proposes EffAcTS, an active learning framework for robust policy search, and instantiates it with Linear Bandits to reduce the amount of trajectory data collected while maintaining policy performance.

## Key findings

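- Gains in sample efficiency are validated experimentally on standard continuous control tasks
- Actively choosing environment model parameters limits data collection to what is needed to select the trajectory subset (for example, the worst-performing trajectories) used to learn a robust policy
- Robust policy search is framed from a Multi-Task Learning perspective, with connections drawn to existing Multi-Task Learning work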

## Abstract

Robust Policy Search is the problem of learning policies that do not degrade in performance when subject to unseen environment model parameters. It is particularly relevant for transferring policies learned in a simulation environment to the real world. Several existing approaches involve sampling large batches of trajectories which reflect the differences in various possible environments, and then selecting some subset of these to learn robust policies, such as the ones that result in the worst performance. We propose an active learning based framework, EffAcTS, to selectively choose model parameters for this purpose so as to collect only as much data as necessary to select such a subset. We apply this framework using Linear Bandits, and experimentally validate the gains in sample efficiency and the performance of our approach on standard continuous control tasks. We also present a Multi-Task Learning perspective to the problem of Robust Policy Search, and draw connections from our proposed framework to existing work on Multi-Task Learning.
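The idea of using a linear bandit to choose which environment parameters to sample can be illustrated with a small sketch. This is not the paper's algorithm: it is a generic lower-confidence-bound linear bandit, and the feature map `features`, the stand-in simulator `rollout_return`, and the values of `beta`, `reg`, and `budget` are all assumptions made for illustration. The bandit models the return as linear in the features of a scalar environment parameter, repeatedly samples the candidate that looks worst under an optimistic-for-low-return bound, and finally reports the `k` candidates with the lowest estimated return.

```python
import numpy as np

def features(p):
    # Hypothetical feature map for a scalar environment parameter.
    return np.array([1.0, p, p * p])

def rollout_return(p, rng):
    # Stand-in for one policy rollout in an environment with parameter p.
    # A made-up noisy function, not a real simulator.
    return 5.0 * (p - 0.8) ** 2 + 0.05 * rng.standard_normal()

def select_worst_params(candidates, budget, k, beta=1.0, reg=1.0, seed=0):
    rng = np.random.default_rng(seed)
    X = np.stack([features(p) for p in candidates])  # (n, d) feature matrix
    d = X.shape[1]
    A = reg * np.eye(d)   # regularised design matrix
    b = np.zeros(d)       # running sum of return-weighted features
    for _ in range(budget):
        A_inv = np.linalg.inv(A)
        theta = A_inv @ b                      # ridge estimate of weights
        mean = X @ theta                       # predicted return per candidate
        quad = np.einsum("ij,jk,ik->i", X, A_inv, X)
        width = np.sqrt(np.maximum(quad, 0.0)) # confidence width per candidate
        # Seek LOW returns: sample the smallest lower confidence bound.
        i = int(np.argmin(mean - beta * width))
        r = rollout_return(candidates[i], rng)
        A += np.outer(X[i], X[i])
        b += r * X[i]
    est = X @ np.linalg.solve(A, b)            # final return estimates
    worst = np.argsort(est)[:k]                # k lowest estimated returns
    return candidates[worst], est

candidates = np.linspace(0.0, 1.0, 21)
worst, est = select_worst_params(candidates, budget=40, k=3)
print("estimated worst-case parameters:", worst)
```

In this sketch only `budget` rollouts are run in total, rather than one batch per candidate parameter, which is the kind of saving the abstract describes. A robust policy update would then use trajectories from the returned parameters.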

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/1901.00117/full.md

## Figures

15 figures with captions in the complete paper: https://tomesphere.com/paper/1901.00117/full.md

## References

35 references — full list in the complete paper: https://tomesphere.com/paper/1901.00117/full.md

---
Source: https://tomesphere.com/paper/1901.00117