Many-Goals Reinforcement Learning

Vivek Veeriah; Junhyuk Oh; Satinder Singh

arXiv:1806.09605·cs.LG·June 27, 2018·29 cites

TL;DR

This paper empirically investigates many-goals updating in deep reinforcement learning, applying it to mastery, pre-training, and auxiliary tasks to improve learning efficiency and performance.

Contribution

The paper extends many-goals updating from small-state tabular RL to deep RL and demonstrates its effectiveness for mastery, pre-training, and auxiliary-task learning.

Findings

01 Many-goals updating improves learning in visual domains.

02 Pre-training with many-goals updating accelerates task learning.

03 Auxiliary goals enhance main task performance.

Abstract

All-goals updating exploits the off-policy nature of Q-learning to update all possible goals an agent could have from each transition in the world, and was introduced into Reinforcement Learning (RL) by Kaelbling (1993). In prior work this was mostly explored in small-state RL problems that allowed tabular representations and where all possible goals could be explicitly enumerated and learned separately. In this paper we empirically explore three different extensions of the idea of updating many (instead of all) goals in the context of RL with deep neural networks (or DeepRL for short). First, in a direct adaptation of Kaelbling's approach, we explore whether many-goals updating can be used to achieve mastery in non-tabular visual-observation domains. Second, we explore whether many-goals updating can be used to pre-train a network to subsequently learn faster and better on a single main task of…
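The tabular all-goals idea described in the abstract can be illustrated with a minimal sketch. This is not the paper's deep-network method. It assumes that goals are states, that the reward for goal g is 1 when the next state equals g and 0 otherwise, and that reaching g terminates the episode for that goal. The function name `all_goals_update`, the 5-state chain environment, and the hyperparameters are all hypothetical choices made for illustration.

```python
import numpy as np

def all_goals_update(Q, s, a, s_next, alpha=0.5, gamma=0.9):
    """Apply one Q-learning update per goal from a single transition.

    Q has shape (n_goals, n_states, n_actions). Assumption: goals are
    states, the reward for goal g is 1 if s_next == g and 0 otherwise,
    and arriving at g is terminal for that goal.
    """
    goals = np.arange(Q.shape[0])
    reached = (s_next == goals).astype(float)   # per-goal reward and terminal flag
    bootstrap = Q[:, s_next, :].max(axis=1)     # max over a' of Q_g(s', a') for each g
    target = reached + gamma * (1.0 - reached) * bootstrap
    Q[:, s, a] += alpha * (target - Q[:, s, a])
    return Q

# Hypothetical toy environment: a 5-state chain.
# Action 0 moves left, action 1 moves right; walls clamp the position.
n_states, n_actions = 5, 2
Q = np.zeros((n_states, n_states, n_actions))
rng = np.random.default_rng(0)

s = 0
for _ in range(5000):
    a = int(rng.integers(n_actions))            # uniformly random behaviour policy
    s_next = max(s - 1, 0) if a == 0 else min(s + 1, n_states - 1)
    all_goals_update(Q, s, a, s_next)           # one transition updates every goal
    s = s_next

# Greedy action from state 0 toward goal 4, and from state 4 toward goal 0.
print(np.argmax(Q[4, 0]), np.argmax(Q[0, 4]))
```

Because Q-learning is off-policy, the random behaviour policy is enough: each transition is reused to improve the value estimates for every goal, not only the one the agent might currently be pursuing. Updating a sampled subset of goals instead of all of them would correspond to the "many" (rather than "all") variant the abstract refers to.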

Taxonomy

Topics: Reinforcement Learning in Robotics · Smart Grid Energy Management · Adaptive Dynamic Programming Control

Methods: Q-Learning