Discovery of Useful Questions as Auxiliary Tasks

Vivek Veeriah; Matteo Hessel; Zhongwen Xu; Richard Lewis; Janarthanan Rajendran; Junhyuk Oh; Hado van Hasselt; David Silver; Satinder Singh

arXiv:1909.04607·cs.AI·September 11, 2019·38 cites

Open Access

TL;DR

This paper introduces a reinforcement learning method in which an agent uses meta-gradients to discover its own questions, formulated as general value functions (GVFs), so that learning answers to them as auxiliary tasks yields representations useful for the main task.

Contribution

The paper presents an approach by which RL agents discover useful auxiliary questions, expressed as GVFs, using non-myopic meta-gradients; learning answers to these questions as auxiliary tasks induces representations that help the main task.

Findings

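01. Auxiliary tasks based on the discovered GVFs are sufficient, on their own, to build representations that support main-task learning.

02. The discovered GVFs build these representations better than popular hand-designed auxiliary tasks.

03. The approach improves data efficiency in Atari 2600 experiments.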

Abstract

Arguably, intelligent agents ought to be able to discover their own questions so that in learning answers for them they learn unanticipated useful knowledge and skills; this departs from the focus in much of machine learning on agents learning answers to externally defined questions. We present a novel method for a reinforcement learning (RL) agent to discover questions formulated as general value functions or GVFs, a fairly rich form of knowledge representation. Specifically, our method uses non-myopic meta-gradients to learn GVF-questions such that learning answers to them, as an auxiliary task, induces useful representations for the main task faced by the RL agent. We demonstrate that auxiliary tasks based on the discovered GVFs are sufficient, on their own, to build representations that support main task learning, and that they do so better than popular hand-designed auxiliary tasks…
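To make the terms "question" and "answer" concrete, the following is a minimal sketch of a single GVF, not the paper's method. It assumes the standard reading of a GVF question as a cumulant signal plus a discount, with the answer being the expected discounted sum of future cumulants. Everything here is hypothetical and chosen for illustration: the three-state deterministic chain, the hand-picked cumulant and discount tables, and the function name `gvf_td0`. In the paper the questions are discovered by meta-gradients rather than fixed by hand; this sketch only shows how an answer to one fixed question can be learned with tabular TD(0).

```python
def gvf_td0(next_state, cumulant, discount, num_steps=3000, step_size=0.5):
    """Learn the answer to one GVF question with tabular TD(0).

    next_state[s] : deterministic successor of state s (hypothetical toy chain)
    cumulant[s]   : signal observed on arriving in state s (the "question")
    discount[s]   : discount applied on arriving in state s (part of the question)
    Returns the learned answers, one value per state.
    """
    values = [0.0] * len(next_state)
    state = 0
    for _ in range(num_steps):
        s_next = next_state[state]
        # TD(0) target: cumulant on arrival plus discounted bootstrap value.
        target = cumulant[s_next] + discount[s_next] * values[s_next]
        values[state] += step_size * (target - values[state])
        state = s_next
    return values


# Hypothetical question: "how much discounted 'arriving in state 0' lies ahead?"
NEXT_STATE = [1, 2, 0]       # cyclic chain 0 -> 1 -> 2 -> 0
CUMULANT = [1.0, 0.0, 0.0]   # cumulant is 1 only on arriving in state 0
DISCOUNT = [0.5, 0.5, 0.5]   # constant discount of 0.5

answers = gvf_td0(NEXT_STATE, CUMULANT, DISCOUNT)
print(answers)
```

For this toy chain the fixed point can be solved by hand: the answers satisfy v(0) = 0.5·v(1), v(1) = 0.5·v(2), and v(2) = 1 + 0.5·v(0), which gives 2/7, 4/7, and 8/7. The learned values approach these numbers. Changing `CUMULANT` or `DISCOUNT` changes the question being asked, which is the quantity the paper's method learns instead of specifying by hand.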

Taxonomy

Topics: Neural Networks and Reservoir Computing · Reinforcement Learning in Robotics