A Deep Hierarchical Approach to Lifelong Learning in Minecraft

Chen Tessler; Shahar Givony; Tom Zahavy; Daniel J. Mankowitz; Shie Mannor

arXiv:1604.07255·cs.AI·December 1, 2016·141 cites



Open Access

TL;DR

This paper introduces a hierarchical deep reinforcement learning system for lifelong learning in Minecraft, enabling efficient knowledge transfer and retention through reusable skills and skill distillation, outperforming standard DQNs.

Contribution

It presents a novel hierarchical architecture with skill distillation for lifelong learning, specifically applied to complex Minecraft tasks.

Findings

01

H-DRLN outperforms standard Deep Q Networks in Minecraft sub-domains.

02

The system demonstrates efficient knowledge retention and transfer.

03

Skill distillation enables scalable lifelong learning.

Abstract

We propose a lifelong learning system that has the ability to reuse and transfer knowledge from one task to another while efficiently retaining the previously learned knowledge base. Knowledge is transferred by learning reusable skills to solve tasks in Minecraft, a popular video game which is an unsolved and high-dimensional lifelong learning problem. These reusable skills, which we refer to as Deep Skill Networks, are then incorporated into our novel Hierarchical Deep Reinforcement Learning Network (H-DRLN) architecture using two techniques: (1) a deep skill array and (2) skill distillation, our novel variation of policy distillation (Rusu et al. 2015) for learning skills. Skill distillation enables the H-DRLN to efficiently retain knowledge and therefore scale in lifelong learning, by accumulating knowledge and encapsulating multiple reusable skills into a single distilled network.…
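The abstract names skill distillation as a variation of policy distillation but does not spell out the objective. As a rough illustration only, the sketch below shows the per-state loss commonly used in policy distillation: the KL divergence between a temperature-softened softmax over a teacher's Q-values and the student's output distribution. The function names, the temperature value, and the example numbers are assumptions made for this sketch, not details taken from the paper, and the paper's own variant may differ.

```python
import math


def softmax(values, temperature=1.0):
    """Numerically stable softmax of values / temperature."""
    scaled = [v / temperature for v in values]
    peak = max(scaled)
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(teacher_q, student_logits, temperature=0.1):
    """KL(teacher || student) for a single state.

    teacher_q: Q-values from a pre-trained skill network (hypothetical input).
    student_logits: raw outputs of the distilled network for the same state.
    temperature: assumed value; a small temperature sharpens the teacher's
    distribution toward its greedy action.
    """
    target = softmax(teacher_q, temperature)
    predicted = softmax(student_logits)
    # Terms with a zero target probability contribute nothing to the KL sum.
    return sum(
        t * math.log(t / p) for t, p in zip(target, predicted) if t > 0.0
    )


if __name__ == "__main__":
    teacher_q = [1.0, 2.0, 0.5]       # made-up Q-values for three actions
    untrained = [0.0, 0.0, 0.0]       # student with a uniform policy
    print(distillation_loss(teacher_q, untrained))
```

Minimizing this loss over states gathered from several teacher skill networks is one plausible way to fold multiple skills into a single student network, which is the role the abstract assigns to the distilled network.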


Taxonomy

Topics: Reinforcement Learning in Robotics · Adaptive Dynamic Programming Control · Advanced Bandit Algorithms Research