DIP-RL: Demonstration-Inferred Preference Learning in Minecraft
Ellen Novoseller, Vinicius G. Goecks, David Watkins, Josh Miller, Nicholas Waytowich

TL;DR
DIP-RL is a novel reinforcement learning approach that uses human demonstrations and inferred preferences to learn reward functions in unstructured environments like Minecraft, enabling more human-aligned agent behaviors.
Contribution
The paper introduces DIP-RL, a new method that leverages demonstrations and preference inference to guide RL without explicit reward signals in complex environments.
Findings
DIP-RL effectively learns reward functions reflecting human preferences.
The method performs competitively against baselines in Minecraft tasks.
DIP-RL successfully integrates demonstrations and preference inference for RL.
Abstract
In machine learning for sequential decision-making, an algorithmic agent learns to interact with an environment while receiving feedback in the form of a reward signal. However, in many unstructured real-world settings, such a reward signal is unknown and humans cannot reliably craft a reward signal that correctly captures desired behavior. To solve tasks in such unstructured and open-ended environments, we present Demonstration-Inferred Preference Reinforcement Learning (DIP-RL), an algorithm that leverages human demonstrations in three distinct ways, including training an autoencoder, seeding reinforcement learning (RL) training batches with demonstration data, and inferring preferences over behaviors to learn a reward function to guide RL. We evaluate DIP-RL in a tree-chopping task in Minecraft. Results suggest that the method can guide an RL agent to learn a reward function that…
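The third use of demonstrations mentioned in the abstract, inferring preferences over behaviors to learn a reward function, can be illustrated with a small sketch. It is not the authors' implementation. It assumes that each demonstration segment is preferred over a paired agent rollout segment, that preferences follow a Bradley-Terry model, and that the reward is linear in per-step features. The abstract states none of these choices. The function names (`preference_loss_and_grad`, `fit_reward`, `make_toy_segments`) and the synthetic data are hypothetical and exist only for illustration.

```python
import numpy as np


def preference_loss_and_grad(w, preferred, other):
    """Bradley-Terry loss and gradient for one pair of segments.

    Each segment is a (T, d) array of per-step features (hypothetical
    stand-ins for learned embeddings). Assumed reward: r = w . phi.
    """
    # Difference of summed features, so diff @ w = return(preferred) - return(other).
    diff = preferred.sum(axis=0) - other.sum(axis=0)
    logit = diff @ w
    # P(preferred beats other) = sigmoid(logit), written with tanh for stability.
    p = 0.5 * (1.0 + np.tanh(0.5 * logit))
    loss = np.logaddexp(0.0, -logit)  # equals -log(sigmoid(logit))
    grad = -(1.0 - p) * diff
    return loss, grad


def mean_loss(w, demo_segments, agent_segments):
    """Average preference loss over all (demonstration, rollout) pairs."""
    losses = [
        preference_loss_and_grad(w, demo, agent)[0]
        for demo, agent in zip(demo_segments, agent_segments)
    ]
    return float(np.mean(losses))


def fit_reward(demo_segments, agent_segments, dim, lr=0.01, steps=200):
    """Fit linear reward weights by plain gradient descent.

    Assumption (not from the source): every demonstration segment is
    labeled as preferred over the agent segment it is paired with.
    """
    w = np.zeros(dim)
    pairs = list(zip(demo_segments, agent_segments))
    for _ in range(steps):
        grad = np.zeros(dim)
        for demo, agent in pairs:
            grad += preference_loss_and_grad(w, demo, agent)[1]
        w -= lr * grad / len(pairs)
    return w


def make_toy_segments(rng, n, length, dim, shift):
    """Synthetic segments: Gaussian features with feature 0 shifted by `shift`."""
    segs = rng.normal(size=(n, length, dim))
    segs[:, :, 0] += shift
    return list(segs)


# Toy data: demonstrations score higher on feature 0 than agent rollouts.
rng = np.random.default_rng(0)
demos = make_toy_segments(rng, n=20, length=10, dim=3, shift=1.0)
rollouts = make_toy_segments(rng, n=20, length=10, dim=3, shift=0.0)

w = fit_reward(demos, rollouts, dim=3)
print("learned reward weights:", w)
print("mean preference loss:", mean_loss(w, demos, rollouts))
```

In this toy setup the learned weight on feature 0 should come out positive, because that is the only feature on which demonstrations and rollouts differ systematically. A learned reward of this kind could then replace the missing environment reward during RL training. In a task like the tree-chopping one, a real system would presumably use a neural reward model over image-based features rather than a linear one.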
Taxonomy
Topics: Anomaly Detection Techniques and Applications
