Simple Embodied Language Learning as a Byproduct of Meta-Reinforcement Learning
Evan Zheran Liu, Sahaana Suri, Tong Mu, Allan Zhou, Chelsea Finn

TL;DR
This paper demonstrates that embodied reinforcement learning agents can implicitly learn language by solving non-language tasks in a multi-task environment, generalizing to new layouts and descriptions without explicit language training.
Contribution
The paper shows that language learning can emerge as a byproduct of solving navigation tasks in a multi-task environment with meta-RL, without direct language supervision.
Findings
RL agents generalize to unseen layouts and language phrases
Agents navigate correctly without direct language supervision
Language emerges as a byproduct of task-solving in embodied RL
Abstract
Whereas machine learning models typically learn language by directly training on language tasks (e.g., next-word prediction), language emerges in human children as a byproduct of solving non-language tasks (e.g., acquiring food). Motivated by this observation, we ask: can embodied reinforcement learning (RL) agents also indirectly learn language from non-language tasks? Learning to associate language with its meaning requires a dynamic environment with varied language. Therefore, we investigate this question in a multi-task environment with language that varies across the different tasks. Specifically, we design an office navigation environment, where the agent's goal is to find a particular office, and office locations differ in different buildings (i.e., tasks). Each building includes a floor plan with a simple language description of the goal office's location, which can be visually…
Taxonomy
Topics: Reinforcement Learning in Robotics · Multimodal Machine Learning Applications · Language and cultural evolution
