Grounded Language Learning in a Simulated 3D World

Karl Moritz Hermann; Felix Hill; Simon Green; Fumin Wang; Ryan Faulkner; Hubert Soyer; David Szepesvari; Wojciech Marian Czarnecki; Max Jaderberg; Denis Teplyashin; Marcus Wainwright; Chris Apps; Demis Hassabis; Phil Blunsom

arXiv:1706.06551 · cs.CL · June 27, 2017 · 153 cites

TL;DR

This paper introduces a simulated 3D environment where an agent learns to interpret and ground language through reinforcement and unsupervised learning, enabling understanding of novel instructions and improving language acquisition efficiency.

Contribution

The study presents a grounded language learning agent, trained with a combination of reinforcement and unsupervised learning, that generalizes its language understanding to new situations in a simulated 3D environment.

Findings

01 Agent successfully interprets language in new situations

02 Learning speed increases with accumulated semantic knowledge

03 Agent relates language to perceptual and action representations

Abstract

We are increasingly surrounded by artificially intelligent technology that takes decisions and executes actions on our behalf. This creates a pressing need for general means to communicate with, instruct and guide artificial agents, with human language the most compelling means for such communication. To achieve this in a scalable fashion, agents must be able to relate language to the world and to actions; that is, their understanding of language must be grounded and embodied. However, learning grounded language is a notoriously challenging problem in artificial intelligence research. Here we present an agent that learns to interpret language in a simulated 3D environment where it is rewarded for the successful execution of written instructions. Trained via a combination of reinforcement and unsupervised learning, and beginning with minimal prior knowledge, the agent learns to relate…

Code & Models

Repositories

SophiaAr/OpenAI-final-project
tf

Taxonomy

Topics: Natural Language Processing Techniques · Speech and dialogue systems · Multimodal Machine Learning Applications

Methods: SPEED: Separable Pyramidal Pooling EncodEr-Decoder for Real-Time Monocular Depth Estimation on Low-Resource Settings