Towards Solving Fuzzy Tasks with Human Feedback: A Retrospective of the MineRL BASALT 2022 Competition
Stephanie Milani, Anssi Kanervisto, Karolis Ramanauskas, Sander Schulhoff, Brandon Houghton, Sharada Mohanty, Byron Galbraith, Ke Chen, Yan Song, Tianze Zhou, Bingquan Yu, He Liu, Kai Guan, Yujing Hu, Tangjie Lv, Federico Malato, Florian Leopold, Amogh Raut, Ville Hautamäki

TL;DR
This paper reviews the MineRL BASALT 2022 competition, which challenged teams to develop algorithms that leverage human feedback to solve complex, hard-to-specify tasks in Minecraft, with the aim of advancing the fine-tuning of foundation models.
Contribution
It provides a retrospective analysis of the competition, highlighting top solutions and discussing future directions for using human feedback in AI training.
Findings
Top solutions effectively used human feedback for task learning
The competition fostered new algorithms for fine-tuning models with human input
The retrospective offers insights into challenges and future research directions in human-in-the-loop learning
Abstract
To facilitate research in the direction of fine-tuning foundation models from human feedback, we held the MineRL BASALT Competition on Fine-Tuning from Human Feedback at NeurIPS 2022. The BASALT challenge asks teams to compete to develop algorithms to solve tasks with hard-to-specify reward functions in Minecraft. Through this competition, we aimed to promote the development of algorithms that use human feedback as a channel for learning the desired behavior. We describe the competition and provide an overview of the top solutions. We conclude by discussing the impact of the competition and future directions for improvement.
Taxonomy
Topics: Explainable Artificial Intelligence (XAI) · Fuzzy Logic and Control Systems
