Human Instruction-Following with Deep Reinforcement Learning via Transfer-Learning from Text
Felix Hill, Sona Mokra, Nathaniel Wong, Tim Harley

TL;DR
This paper combines deep reinforcement learning with a pre-trained language model (BERT) so that agents in 3D simulated environments can follow natural human instructions, and it reports substantially above-chance zero-shot transfer from synthetic to natural instructions.
Contribution
The paper presents a simple transfer-learning approach using BERT to train deep RL agents for natural language instruction-following, bridging language understanding and motor control.
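The general idea of pairing a pre-trained text encoder with an RL-trained policy can be illustrated with a toy sketch. Everything below is an assumption for illustration and not the authors' implementation: `FrozenTextEncoder` is a hypothetical stand-in for BERT (a fixed pseudo-random word-vector table with mean pooling), `InstructionConditionedPolicy` is a hypothetical linear softmax policy, and the update rule is plain REINFORCE, chosen only because it is short. The point it shows is that the language representation can stay fixed while only the policy parameters are adjusted from reward.

```python
import math
import random


class FrozenTextEncoder:
    """Hypothetical stand-in for a pre-trained text encoder such as BERT.

    Its word vectors are fixed and are never updated by the RL step below.
    """

    def __init__(self, dim=8, seed=0):
        self.dim = dim
        self.seed = seed
        self._table = {}

    def _word_vector(self, word):
        # Deterministic pseudo-random vector per word (illustrative only).
        if word not in self._table:
            rng = random.Random(f"{self.seed}:{word}")
            self._table[word] = [rng.uniform(-1.0, 1.0) for _ in range(self.dim)]
        return self._table[word]

    def encode(self, instruction):
        # Mean-pool the word vectors into one fixed-size embedding.
        words = instruction.lower().split()
        if not words:
            return [0.0] * self.dim
        vectors = [self._word_vector(w) for w in words]
        return [sum(col) / len(vectors) for col in zip(*vectors)]


class InstructionConditionedPolicy:
    """Hypothetical linear softmax policy over [observation ; instruction]."""

    def __init__(self, obs_dim, text_dim, num_actions, seed=1):
        rng = random.Random(seed)
        in_dim = obs_dim + text_dim
        self.weights = [
            [rng.gauss(0.0, 0.1) for _ in range(in_dim)]
            for _ in range(num_actions)
        ]

    def action_probs(self, obs_features, text_embedding):
        x = list(obs_features) + list(text_embedding)
        logits = [sum(w * xi for w, xi in zip(row, x)) for row in self.weights]
        top = max(logits)  # subtract the max for numerical stability
        exps = [math.exp(logit - top) for logit in logits]
        total = sum(exps)
        return [e / total for e in exps]

    def reinforce_update(self, obs_features, text_embedding, action, reward, lr=0.1):
        # REINFORCE step for a softmax-linear policy:
        # d log pi(a) / d w_k = (1[k == a] - p_k) * x
        x = list(obs_features) + list(text_embedding)
        probs = self.action_probs(obs_features, text_embedding)
        for k, row in enumerate(self.weights):
            coeff = lr * reward * ((1.0 if k == action else 0.0) - probs[k])
            for i in range(len(row)):
                row[i] += coeff * x[i]


# Usage: encode an instruction once, then update only the policy from reward.
encoder = FrozenTextEncoder()
policy = InstructionConditionedPolicy(obs_dim=3, text_dim=encoder.dim, num_actions=4)
observation = [0.5, -0.2, 0.1]  # made-up visual features
embedding = encoder.encode("put the cup on the tray")
policy.reinforce_update(observation, embedding, action=2, reward=1.0)
```

In this sketch the encoder has no update method at all, so whatever the text model knows about wording is carried into the policy unchanged; whether and how the paper's agents freeze or adapt BERT is not stated in the summary above.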
Findings
Substantially above-chance zero-shot transfer from synthetic to natural instructions
Effective training of instruction-following agents with natural language in 3D environments
Bridges agent motor behavior with text-based language understanding
Abstract
Recent work has described neural-network-based agents that are trained with reinforcement learning (RL) to execute language-like commands in simulated worlds, as a step towards an intelligent agent or robot that can be instructed by human users. However, the optimisation of multi-goal motor policies via deep RL from scratch requires many episodes of experience. Consequently, instruction-following with deep RL typically involves language generated from templates (by an environment simulator), which does not reflect the varied or ambiguous expressions of real users. Here, we propose a conceptually simple method for training instruction-following agents with deep RL that are robust to natural human instructions. By applying our method with a state-of-the-art pre-trained text-based language model (BERT), on tasks requiring agents to identify and position everyday objects relative to other…
Taxonomy
Topics: Multimodal Machine Learning Applications · Topic Modeling · Natural Language Processing Techniques
