JueWu-MC: Playing Minecraft with Sample-efficient Hierarchical Reinforcement Learning

Zichuan Lin; Junyou Li; Jianing Shi; Deheng Ye; Qiang Fu; Wei Yang

arXiv:2112.04907 · cs.LG · December 10, 2021

TL;DR

JueWu-MC is a hierarchical reinforcement learning method that significantly improves sample efficiency and performance in Minecraft by combining representation learning, imitation learning, and hierarchical control.

Contribution

The paper introduces a novel hierarchical RL framework with integrated techniques for perception and exploration, achieving state-of-the-art results in Minecraft.

Findings

1. Outperforms baseline methods in sample efficiency and performance

2. Won the NeurIPS MineRL 2021 competition

3. Achieves the highest performance score in Minecraft RL tasks

Abstract

Learning rational behaviors in open-world games like Minecraft remains challenging for Reinforcement Learning (RL) research because of the compound challenge of partial observability, high-dimensional visual perception, and delayed reward. To address this, we propose JueWu-MC, a sample-efficient hierarchical RL approach equipped with representation learning and imitation learning to deal with perception and exploration. Specifically, our approach includes two levels of hierarchy: the high-level controller learns a policy over options, and the low-level workers learn to solve each sub-task. To boost the learning of sub-tasks, we propose a combination of techniques including 1) action-aware representation learning, which captures underlying relations between action and representation, 2) discriminator-based self-imitation learning for efficient exploration, and 3) ensemble…
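
The two-level hierarchy described in the abstract can be illustrated with a toy control loop. This is a minimal sketch, not the authors' implementation: the option names, the dictionary state, the three-step termination rule, and the functions `make_worker`, `high_level_policy`, and `run_episode` are all hypothetical stand-ins, and the hand-written rules take the place of learned policies.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical sub-task names; the source does not list the actual options.
OPTIONS: List[str] = ["gather_wood", "craft_tool", "mine_ore"]
TARGET = 3  # toy termination threshold, an assumption for illustration

State = Dict[str, int]  # toy stand-in for an observation


def make_worker(option: str) -> Callable[[State], Tuple[State, bool]]:
    """Build a toy low-level worker that makes one unit of progress per step."""
    def step(state: State) -> Tuple[State, bool]:
        new_state = dict(state)
        new_state[option] = new_state.get(option, 0) + 1
        done = new_state[option] >= TARGET
        return new_state, done
    return step


def high_level_policy(state: State) -> str:
    """Toy controller: choose the first option whose sub-task is unfinished.

    A learned controller would replace this fixed rule.
    """
    for option in OPTIONS:
        if state.get(option, 0) < TARGET:
            return option
    return OPTIONS[-1]


def run_episode(max_steps: int = 50) -> Tuple[State, List[str]]:
    """Alternate between option selection and low-level execution."""
    workers = {o: make_worker(o) for o in OPTIONS}
    state: State = {}
    chosen: List[str] = []
    steps = 0
    while steps < max_steps and any(state.get(o, 0) < TARGET for o in OPTIONS):
        option = high_level_policy(state)  # high level picks an option
        chosen.append(option)
        done = False
        while not done and steps < max_steps:  # worker runs until it terminates
            state, done = workers[option](state)
            steps += 1
    return state, chosen


if __name__ == "__main__":
    final_state, chosen_options = run_episode()
    print(chosen_options)
    print(final_state)
```

The point of the structure is that the controller decides only which sub-task to pursue, while each worker handles the step-by-step actions for its own sub-task until it reports completion.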
