Representation Learning for Grounded Spatial Reasoning

Michael Janner; Karthik Narasimhan; Regina Barzilay

arXiv:1707.03938·cs.CL·November 15, 2017·1 cites

TL;DR

This paper introduces a model, trained with reinforcement learning, that learns a representation of the environment steered by instruction text. It improves spatial reasoning in a simulated environment and reduces goal localization error by 45%.

Contribution

It presents a representation learning approach in which instruction text steers the representation of the environment, aligning local neighborhoods with their verbalizations while also handling global references.

Findings

01. 45% reduction in goal localization error

02. Outperforms state-of-the-art methods on multiple metrics

03. Effective handling of local and global spatial references

Abstract

The interpretation of spatial references is highly contextual, requiring joint inference over both language and the environment. We consider the task of spatial reasoning in a simulated environment, where an agent can act and receive rewards. The proposed model learns a representation of the world steered by instruction text. This design allows for precise alignment of local neighborhoods with corresponding verbalizations, while also handling global references in the instructions. We train our model with reinforcement learning using a variant of generalized value iteration. The model outperforms state-of-the-art approaches on several metrics, yielding a 45% reduction in goal localization error.
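
The abstract mentions training with a variant of generalized value iteration. As background only, the sketch below shows standard tabular value iteration on a tiny deterministic grid. It is not the paper's learned model: the grid, the reward layout, the discount factor, and the function names `value_iteration` and `greedy_goal` are all assumptions made for illustration. Reading the goal off as the highest-valued cell is likewise an illustrative choice, not a description of the paper's evaluation.

```python
# Hypothetical toy setup: a deterministic grid world with one rewarding cell.
# This is standard tabular value iteration, not the paper's learned model.

def value_iteration(rewards, gamma=0.9, iterations=50):
    """Return a value map for a grid of per-cell rewards.

    Each cell's value is its own reward plus the discounted best value
    among the cells reachable by staying put or moving up/down/left/right.
    """
    rows, cols = len(rewards), len(rewards[0])
    values = [[0.0] * cols for _ in range(rows)]
    moves = [(0, 0), (-1, 0), (1, 0), (0, -1), (0, 1)]
    for _ in range(iterations):
        new_values = [[0.0] * cols for _ in range(rows)]
        for r in range(rows):
            for c in range(cols):
                # Best value over in-bounds successor cells.
                best = max(
                    values[r + dr][c + dc]
                    for dr, dc in moves
                    if 0 <= r + dr < rows and 0 <= c + dc < cols
                )
                new_values[r][c] = rewards[r][c] + gamma * best
        values = new_values
    return values


def greedy_goal(values):
    """Return the (row, col) of the highest-valued cell."""
    cells = [(r, c) for r in range(len(values)) for c in range(len(values[0]))]
    return max(cells, key=lambda rc: values[rc[0]][rc[1]])


# A 3x3 grid whose bottom-right cell carries the only reward.
rewards = [
    [0.0, 0.0, 0.0],
    [0.0, 0.0, 0.0],
    [0.0, 0.0, 1.0],
]
values = value_iteration(rewards)
predicted_goal = greedy_goal(values)
```

In this toy grid, values fall off with distance from the rewarding cell, so following increasing values leads an agent toward it.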

Code & Models

Repositories

JannerM/spatial-reasoning (PyTorch, official)

Taxonomy

Topics: Multimodal Machine Learning Applications · Speech and dialogue systems · Topic Modeling