Discovery of Useful Questions as Auxiliary Tasks
Vivek Veeriah, Matteo Hessel, Zhongwen Xu, Richard Lewis, Janarthanan Rajendran, Junhyuk Oh, Hado van Hasselt, David Silver, Satinder Singh

TL;DR
This paper introduces a novel reinforcement learning method where agents autonomously discover their own questions, formulated as general value functions, to improve learning efficiency and representation quality through meta-gradient optimization.
Contribution
The paper presents a new approach for RL agents to discover useful auxiliary questions as GVFs using non-myopic meta-gradients, enhancing learning and data efficiency.
Findings
Discovered GVFs support main task learning without hand-designed tasks.
Auxiliary tasks based on meta-learned GVFs build better representations for the main task than popular hand-designed auxiliary tasks.
The method improves data efficiency in Atari 2600 experiments.
Abstract
Arguably, intelligent agents ought to be able to discover their own questions so that in learning answers for them they learn unanticipated useful knowledge and skills; this departs from the focus in much of machine learning on agents learning answers to externally defined questions. We present a novel method for a reinforcement learning (RL) agent to discover questions formulated as general value functions or GVFs, a fairly rich form of knowledge representation. Specifically, our method uses non-myopic meta-gradients to learn GVF-questions such that learning answers to them, as an auxiliary task, induces useful representations for the main task faced by the RL agent. We demonstrate that auxiliary tasks based on the discovered GVFs are sufficient, on their own, to build representations that support main task learning, and that they do so better than popular hand-designed auxiliary tasks…
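To make the notion of a GVF "question" concrete, here is a minimal sketch, assuming the common formulation in which a question is specified by a cumulant signal and a discount (continuation) signal, and the "answer" is the expected discounted sum of future cumulants. The function name `gvf_targets` and the hand-written cumulant and discount lists are hypothetical illustrations; in the method described above these signals would be discovered via meta-gradients rather than fixed by hand, and this sketch does not reproduce the authors' implementation.

```python
from typing import List, Sequence


def gvf_targets(
    cumulants: Sequence[float],
    discounts: Sequence[float],
    bootstrap: float = 0.0,
) -> List[float]:
    """Compute Monte Carlo targets G_t = c_t + gamma_t * G_{t+1} for one trajectory.

    cumulants[t] and discounts[t] are the (assumed, hand-supplied) question
    signals on transition t; bootstrap is the value estimate used after the
    last transition (0.0 if the trajectory terminates).
    """
    if len(cumulants) != len(discounts):
        raise ValueError("cumulants and discounts must have the same length")

    targets = [0.0] * len(cumulants)
    g = bootstrap
    # Work backwards so each target reuses the one that follows it.
    for t in reversed(range(len(cumulants))):
        g = cumulants[t] + discounts[t] * g
        targets[t] = g
    return targets


# Hypothetical three-step trajectory with a constant discount of 0.5.
example_targets = gvf_targets([1.0, 0.0, 2.0], [0.5, 0.5, 0.5])
```

An auxiliary task in this setting would train a prediction head to regress onto targets like these, so that the shared representation is shaped by whatever the cumulants and discounts ask about.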
Taxonomy
Topics: Neural Networks and Reservoir Computing · Reinforcement Learning in Robotics
