Unicorn: Continual Learning with a Universal, Off-policy Agent

Daniel J. Mankowitz; Augustin Žídek; André Barreto; Dan Horgan; Matteo Hessel; John Quan; Junhyuk Oh; Hado van Hasselt; David Silver; Tom Schaul

arXiv:1802.08294 · cs.LG · July 4, 2018 · 37 cites

Open Access

TL;DR

Unicorn is an off-policy agent architecture for continual learning without explicit task boundaries. It jointly represents and learns multiple policies, and it outperforms several baseline agents on a challenging 3D domain with an implicit sequence of tasks and sparse rewards.

Contribution

The paper introduces Unicorn, a universal off-policy agent that effectively handles continual learning without explicit task boundaries by jointly representing multiple policies.

Findings

01. Unicorn outperforms baseline agents in a complex 3D continual learning domain.

02. The agent efficiently learns multiple policies simultaneously.

03. Unicorn demonstrates strong continual learning capabilities in implicit task sequences.

Abstract

Some real-world domains are best characterized as a single task, but for others this perspective is limiting. Instead, some tasks continually grow in complexity, in tandem with the agent's competence. In continual learning, also referred to as lifelong learning, there are no explicit task boundaries or curricula. As learning agents have become more powerful, continual learning remains one of the frontiers that has resisted quick progress. To test continual learning capabilities we consider a challenging 3D domain with an implicit sequence of tasks and sparse rewards. We propose a novel agent architecture called Unicorn, which demonstrates strong continual learning and outperforms several baseline agents on the proposed domain. The agent achieves this by jointly representing and learning multiple policies efficiently, using a parallel off-policy learning setup.
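
The idea of jointly learning multiple policies off-policy can be illustrated with a small sketch. The code below is not the paper's architecture: it assumes a tabular, goal-conditioned Q-table, one-step Q-learning, and a toy five-state chain, and it runs in a single process rather than in parallel. All names (make_q_table, update_all_goals, train) are hypothetical. What it shows is that a single transition, generated by random behaviour, can update the value estimates for every goal at once.

```python
import random


def make_q_table(n_goals, n_states, n_actions):
    """Q[g][s][a]: one action-value table per goal, stored jointly."""
    return [[[0.0] * n_actions for _ in range(n_states)] for _ in range(n_goals)]


def update_all_goals(q, s, a, s_next, rewards, done, alpha=0.5, gamma=0.9):
    """One-step Q-learning update for every goal from one shared transition.

    rewards[g] is the reward this transition would earn under goal g, so a
    goal learns even from behaviour that was not chosen for it (off-policy).
    """
    for g, r in enumerate(rewards):
        bootstrap = 0.0 if done else gamma * max(q[g][s_next])
        q[g][s][a] += alpha * (r + bootstrap - q[g][s][a])


def greedy_action(q, g, s):
    """Best action in state s according to the value estimates for goal g."""
    values = q[g][s]
    return max(range(len(values)), key=values.__getitem__)


def step(s, a, n_states):
    """Toy chain dynamics (an assumption): action 0 moves left, 1 moves right."""
    return max(0, min(n_states - 1, s + (1 if a == 1 else -1)))


def train(n_states=5, episodes=500, seed=0):
    rng = random.Random(seed)
    goal_states = [0, n_states - 1]  # goal 0: left end, goal 1: right end
    q = make_q_table(len(goal_states), n_states, 2)
    for _ in range(episodes):
        s = n_states // 2
        for _ in range(20):
            a = rng.randrange(2)  # uniform random behaviour policy
            s_next = step(s, a, n_states)
            rewards = [1.0 if s_next == gs else 0.0 for gs in goal_states]
            done = s_next in goal_states  # either end of the chain is terminal
            update_all_goals(q, s, a, s_next, rewards, done)
            s = s_next
            if done:
                break
    return q


if __name__ == "__main__":
    q = train()
    middle = 2
    print("goal 0 greedy action:", greedy_action(q, 0, middle))
    print("goal 1 greedy action:", greedy_action(q, 1, middle))
```

In this sketch the behaviour policy never targets either goal, yet the greedy policies for the two goals end up pointing in opposite directions from the middle state, because every transition is scored under both reward functions.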

Taxonomy

Topics: Domain Adaptation and Few-Shot Learning · Multimodal Machine Learning Applications · Reinforcement Learning in Robotics