Towards robust and domain agnostic reinforcement learning competitions

William Hebgen Guss; Stephanie Milani; Nicholay Topin; Brandon Houghton; Sharada Mohanty; Andrew Melnik; Augustin Harter; Benoit Buschmaas; Bjarne Jaster; Christoph Berganski; Dennis Heitkamp; Marko Henning; Helge Ritter; Chengjie Wu; Xiaotian Hao; Yiming Lu; Hangyu Mao; Yihuan Mao; Chao Wang; Michal Opanowicz; Anssi Kanervisto; Yanick Schraner; Christian Scheller; Xiren Zhou; Lu Liu; Daichi Nishio; Toi Tsuneda; Karolis Ramanauskas; Gabija Juceviciute

arXiv:2106.03748·cs.LG·June 8, 2021·1 cites

TL;DR

This paper introduces a new competition framework for reinforcement learning that emphasizes reproducibility, domain-agnostic solutions, and resource efficiency, demonstrated through the MineRL 2020 Competition.

Contribution

It proposes four mechanisms (submission retraining, domain randomization, domain obfuscation, and limits on compute and environment samples) to improve RL competition design, and demonstrates them in the MineRL 2020 Competition.
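As a rough illustration of two of these mechanisms, the sketch below wraps a toy environment so that observations lose their human-readable meaning and the total number of environment samples is capped. It is not the competition's implementation: `ToyEnv`, `ObfuscatedBudgetedEnv`, and `BudgetExceeded` are hypothetical names, and the fixed random linear map is only a simple stand-in for whatever obfuscation the organizers actually used.

```python
import random


class BudgetExceeded(RuntimeError):
    """Raised when the environment-sample budget is used up."""


class ToyEnv:
    """Hypothetical stand-in environment with a 3-number observation."""

    def reset(self):
        self.t = 0
        return [0.0, 0.0, 0.0]

    def step(self, action):
        self.t += 1
        obs = [float(self.t), float(action), 1.0]
        reward = 1.0 if action == 1 else 0.0
        done = self.t >= 5
        return obs, reward, done


class ObfuscatedBudgetedEnv:
    """Hides observation semantics and caps the total samples drawn.

    Assumption: a fixed random linear map stands in for obfuscation.
    """

    def __init__(self, env, max_samples, seed, obs_dim=3):
        self.env = env
        self.max_samples = max_samples
        self.samples_used = 0
        rng = random.Random(seed)
        # One fixed matrix per seed, so the mapping is consistent
        # across episodes but carries no readable meaning.
        self.matrix = [
            [rng.uniform(-1.0, 1.0) for _ in range(obs_dim)]
            for _ in range(obs_dim)
        ]

    def _obfuscate(self, obs):
        return [sum(w * x for w, x in zip(row, obs)) for row in self.matrix]

    def reset(self):
        return self._obfuscate(self.env.reset())

    def step(self, action):
        # Refuse further interaction once the sample budget is spent.
        if self.samples_used >= self.max_samples:
            raise BudgetExceeded("environment-sample budget exhausted")
        self.samples_used += 1
        obs, reward, done = self.env.step(action)
        return self._obfuscate(obs), reward, done


if __name__ == "__main__":
    env = ObfuscatedBudgetedEnv(ToyEnv(), max_samples=3, seed=0)
    env.reset()
    for _ in range(3):
        obs, reward, done = env.step(1)
    print("samples used:", env.samples_used)
```

In this sketch an agent only ever sees the transformed vectors, so hand-written rules that depend on what a given observation entry means no longer apply, and a fourth call to `step` raises `BudgetExceeded`.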

Findings

1. Submissions became more reproducible and domain-agnostic.
2. Participants developed sample-efficient algorithms.
3. The competition successfully promoted robust RL solutions.

Abstract

Reinforcement learning competitions have formed the basis for standard research benchmarks, galvanized advances in the state-of-the-art, and shaped the direction of the field. Despite this, a majority of challenges suffer from the same fundamental problems: participant solutions to the posed challenge are usually domain-specific, biased to maximally exploit compute resources, and not guaranteed to be reproducible. In this paper, we present a new framework of competition design that promotes the development of algorithms that overcome these barriers. We propose four central mechanisms for achieving this end: submission retraining, domain randomization, desemantization through domain obfuscation, and the limitation of competition compute and environment-sample budget. To demonstrate the efficacy of this design, we proposed, organized, and ran the MineRL 2020 Competition on…

Taxonomy

Topics: Reinforcement Learning in Robotics · Machine Learning and Data Classification · Robot Manipulation and Learning