NetHack is Hard to Hack
Ulyana Piterbarg, Lerrel Pinto, Rob Fergus

TL;DR
This paper investigates why neural policy learning struggles in complex, long-horizon environments like NetHack, and develops a neural agent that significantly improves performance but still lags behind symbolic and human strategies.
Contribution
The paper provides an extensive analysis of neural policy learning in NetHack, introduces a large demonstration dataset, and achieves new state-of-the-art performance for a neural agent.
Findings
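The paper's neural agent outperforms previous fully neural policies by 127% in the offline setting and 25% in the online setting, measured by median game score.
In the NeurIPS 2021 NetHack Challenge, symbolic agents outperformed neural approaches by over four times in median game score.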
Scaling neural models alone is insufficient to match symbolic or human performance.
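To make the arithmetic behind these figures concrete, here is a minimal sketch of how a percentage gain and a "times better" ratio over median game score can be computed. The score lists are hypothetical values made up for illustration, not results from the paper, and the function names are assumptions of this sketch.

```python
from statistics import median


def relative_improvement(new_scores, baseline_scores):
    """Percent gain of the median new score over the median baseline score."""
    base = median(baseline_scores)
    return 100.0 * (median(new_scores) - base) / base


def median_ratio(a_scores, b_scores):
    """How many times larger the median of a_scores is than the median of b_scores."""
    return median(a_scores) / median(b_scores)


# Hypothetical per-game scores (illustrative only, NOT taken from the paper).
baseline_neural = [400, 500, 600]
improved_neural = [900, 1135, 1500]
symbolic = [1800, 2100, 5000]

print(relative_improvement(improved_neural, baseline_neural))  # 127.0
print(median_ratio(symbolic, baseline_neural))
```

Read this way, a 127% improvement means the new median is 2.27 times the old one, while "over four times" refers to a ratio of medians rather than a percentage gain.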
Abstract
Neural policy learning methods have achieved remarkable results in various control problems, ranging from Atari games to simulated locomotion. However, these methods struggle in long-horizon tasks, especially in open-ended environments with multi-modal observations, such as the popular dungeon-crawler game, NetHack. Intriguingly, the NeurIPS 2021 NetHack Challenge revealed that symbolic agents outperformed neural approaches by over four times in median game score. In this paper, we delve into the reasons behind this performance gap and present an extensive study on neural policy learning for NetHack. To conduct this study, we analyze the winning symbolic agent, extending its codebase to track internal strategy selection in order to generate one of the largest available demonstration datasets. Utilizing this dataset, we examine (i) the advantages of an action hierarchy; (ii) enhancements…
Taxonomy
Topics: Reinforcement Learning in Robotics · Artificial Intelligence in Games · Educational Games and Gamification
