# CLIC: Curriculum Learning and Imitation for object Control in non-rewarding environments

**Authors:** Pierre Fournier, Olivier Sigaud, Cédric Colas, Mohamed Chetouani

arXiv: 1901.09720 · 2019-03-26

## TL;DR

This paper introduces CLIC, an unsupervised reinforcement learning agent for non-rewarding environments that contain several objects. CLIC learns to control individual objects and imitates an independent agent's interactions with them, choosing which objects to focus on by maximizing its learning progress.

## Contribution

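The paper defines a new reinforcement learning setting (a non-rewarding environment with several possibly related objects and an independently acting agent, Bob), gives a generic discrete-state, discrete-action model of such environments, and proposes CLIC, an unsupervised RL agent that learns object control and imitation in this setting.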

## Key findings

- CLIC gains control of objects faster by observing Bob, an independent agent, even when Bob is not explicitly teaching.
- When Bob acts as a mentor and provides ordered demonstrations, CLIC follows them.
- CLIC ignores non-reproducible and already mastered interactions with objects, which increases the benefit it draws from imitation.

## Abstract

In this paper we study a new reinforcement learning setting where the environment is non-rewarding, contains several possibly related objects of various controllability, and where an apt agent Bob acts independently, with non-observable intentions. We argue that this setting defines a realistic scenario and we present a generic discrete-state discrete-action model of such environments. To learn in this environment, we propose an unsupervised reinforcement learning agent called CLIC, for Curriculum Learning and Imitation for Control. CLIC learns to control individual objects in its environment, and imitates Bob's interactions with these objects. It selects objects to focus on when training and imitating by maximizing its learning progress. We show that CLIC is an effective baseline in our new setting. It can effectively observe Bob to gain control of objects faster, even if Bob is not explicitly teaching. It can also follow Bob when he acts as a mentor and provides ordered demonstrations. Finally, when Bob controls objects that the agent cannot, or in the presence of a hierarchy between objects in the environment, we show that CLIC ignores non-reproducible and already mastered interactions with objects, resulting in a greater benefit from imitation.
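The abstract says CLIC picks which object to train on or imitate by maximizing its learning progress. This summary does not give the paper's exact formula, so the following is only a minimal sketch under stated assumptions: learning progress is taken to be the absolute change in success rate between the older and the newer half of a sliding window, and objects are sampled in proportion to it with a small uniform exploration term. The class name `LearningProgressSelector` and the parameters `window` and `epsilon` are hypothetical.

```python
import random
from collections import deque


class LearningProgressSelector:
    """Hypothetical sketch of learning-progress-based object selection.

    Assumption: competence on an object is its recent success rate, and
    learning progress (LP) is the absolute change in that rate. This is an
    illustration, not the paper's exact definition.
    """

    def __init__(self, n_objects, window=20, epsilon=0.2, seed=0):
        self.epsilon = epsilon  # share of probability mass spread uniformly
        self.rng = random.Random(seed)
        # One bounded history of 0/1 outcomes per object.
        self.history = [deque(maxlen=2 * window) for _ in range(n_objects)]

    def record(self, obj, success):
        """Store the outcome of one control attempt on object `obj`."""
        self.history[obj].append(1.0 if success else 0.0)

    def learning_progress(self, obj):
        """Absolute difference between newer and older success rates."""
        h = list(self.history[obj])
        if len(h) < 2:
            return 0.0
        half = len(h) // 2
        older = sum(h[:half]) / half
        recent = sum(h[half:]) / (len(h) - half)
        return abs(recent - older)

    def probabilities(self):
        """Sampling distribution over objects: epsilon-uniform plus LP share."""
        n = len(self.history)
        lps = [self.learning_progress(i) for i in range(n)]
        total = sum(lps)
        if total == 0:
            return [1.0 / n] * n  # no signal yet: explore uniformly
        return [self.epsilon / n + (1 - self.epsilon) * lp / total for lp in lps]

    def select(self):
        """Sample the index of the object to focus on next."""
        probs = self.probabilities()
        return self.rng.choices(range(len(probs)), weights=probs, k=1)[0]


# Usage: object 0 is never controlled (e.g. not reproducible), object 1 is
# improving, object 2 is already mastered.
selector = LearningProgressSelector(n_objects=3)
for step in range(20):
    selector.record(0, False)
    selector.record(1, step >= 10)
    selector.record(2, True)
probs = selector.probabilities()
```

Under this assumed definition, an object that is never controlled and an object that is always controlled both have zero learning progress, so most of the sampling mass goes to the object whose success rate is still changing. That is one simple way to obtain the behavior described in the abstract, where non-reproducible and already mastered interactions are ignored.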

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/1901.09720/full.md

## Figures

10 figures with captions in the complete paper: https://tomesphere.com/paper/1901.09720/full.md

## References

36 references — full list in the complete paper: https://tomesphere.com/paper/1901.09720/full.md

---
Source: https://tomesphere.com/paper/1901.09720