# Identification and Off-Policy Learning of Multiple Objectives Using Adaptive Clustering

**Authors:** Thommen George Karimpanal, Erik Wilhelm

arXiv: 1705.06342 · 2019-01-11

## TL;DR

This paper introduces an adaptive clustering method that enables an agent to autonomously identify multiple objectives in its environment and learn them in parallel using off-policy Q-learning, improving efficiency in exploration and knowledge accumulation.

## Contribution

The work presents a novel unsupervised adaptive clustering approach for objective identification and demonstrates its integration with off-policy learning to handle multiple unknown objectives simultaneously.
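To make the idea of online, unsupervised objective identification concrete, here is a minimal sketch of a threshold-based ("leader"-style) online clusterer. It is a generic stand-in, not the authors' algorithm: the class name `OnlineClusterer`, the `radius` threshold, and the running-mean centroid update are all assumptions made for illustration, since this summary does not describe the paper's clustering rule.

```python
import math


class OnlineClusterer:
    """Leader-style online clustering (hypothetical stand-in, not the paper's method)."""

    def __init__(self, radius):
        self.radius = radius   # assumed distance threshold for opening a new cluster
        self.centroids = []    # one running-mean centroid per cluster
        self.counts = []       # number of samples absorbed by each cluster

    def observe(self, x):
        """Assign x to the nearest cluster, or open a new one; return the cluster index."""
        if self.centroids:
            dists = [math.dist(c, x) for c in self.centroids]
            k = min(range(len(dists)), key=dists.__getitem__)
            if dists[k] <= self.radius:
                # Close enough: fold x into the existing centroid as a running mean.
                self.counts[k] += 1
                n = self.counts[k]
                self.centroids[k] = [
                    c + (xi - c) / n for c, xi in zip(self.centroids[k], x)
                ]
                return k
        # No cluster is near enough: treat x as a newly discovered candidate objective.
        self.centroids.append(list(x))
        self.counts.append(1)
        return len(self.centroids) - 1
```

In this sketch each cluster would play the role of one candidate objective: the number of clusters is not fixed in advance and grows as the agent encounters sufficiently distinct observations.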

## Key findings

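- Converged or partially converged value function weights obtained through off-policy learning can accumulate knowledge about multiple objectives.
- This knowledge is acquired without any additional exploration, which matters where exploration is time and energy intensive.
- Learning the identified objectives in parallel makes more efficient use of the agent's exploratory actions.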

## Abstract

In this work, we present a methodology that enables an agent to make efficient use of its exploratory actions by autonomously identifying possible objectives in its environment and learning them in parallel. The identification of objectives is achieved using an online and unsupervised adaptive clustering algorithm. The identified objectives are learned (at least partially) in parallel using Q-learning. Using a simulated agent and environment, it is shown that the converged or partially converged value function weights resulting from off-policy learning can be used to accumulate knowledge about multiple objectives without any additional exploration. We claim that the proposed approach could be useful in scenarios where the objectives are initially unknown or in real world scenarios where exploration is typically a time and energy intensive process. The implications and possible extensions of this work are also briefly discussed.
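The parallel off-policy learning described in the abstract can be illustrated with a small sketch. It assumes a hypothetical 1-D corridor environment, two hand-picked goal states standing in for identified objectives, and tabular Q-values rather than the value function weights the paper refers to. All names and constants (`N_STATES`, `GOALS`, `ALPHA`, `GAMMA`, `train`, `greedy_action`) are illustrative assumptions. The point it shows is that a single stream of random exploratory transitions can update one Q-function per objective.

```python
import random

N_STATES = 7                                  # hypothetical 1-D corridor, states 0..6
ACTIONS = (-1, +1)                            # step left / step right
GOALS = {"left": 0, "right": N_STATES - 1}    # assumed objectives, one goal state each
ALPHA, GAMMA = 0.5, 0.9                       # assumed learning rate and discount factor


def step(s, a):
    """Deterministic transition, clipped at the corridor ends."""
    return min(max(s + a, 0), N_STATES - 1)


def train(episodes=300, max_steps=50, seed=0):
    rng = random.Random(seed)
    # One Q-table per objective, indexed as q[objective][state][action_index].
    q = {g: [[0.0, 0.0] for _ in ACTIONS and range(N_STATES)] for g in GOALS}
    for _ in range(episodes):
        s = rng.randrange(N_STATES)
        for _ in range(max_steps):
            ai = rng.randrange(len(ACTIONS))  # purely random behaviour policy
            s2 = step(s, ACTIONS[ai])
            # Off-policy: the same transition updates every objective's Q-table.
            for g, goal in GOALS.items():
                if s == goal:
                    continue                  # goal state is terminal for this objective
                r = 1.0 if s2 == goal else 0.0
                # Q at the goal state is never updated, so it stays 0 (terminal value).
                target = r + GAMMA * max(q[g][s2])
                q[g][s][ai] += ALPHA * (target - q[g][s][ai])
            s = s2
    return q


def greedy_action(q, goal_name, s):
    """Action that is greedy with respect to one objective's Q-table."""
    row = q[goal_name][s]
    return ACTIONS[row.index(max(row))]
```

After `q = train()`, calling `greedy_action(q, "left", 3)` and `greedy_action(q, "right", 3)` queries two different policies learned from the same exploration, which is the sense in which no additional exploration is needed per objective.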

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/1705.06342/full.md

## Figures

6 figures with captions in the complete paper: https://tomesphere.com/paper/1705.06342/full.md

## References

31 references — full list in the complete paper: https://tomesphere.com/paper/1705.06342/full.md

---
Source: https://tomesphere.com/paper/1705.06342