TL;DR
This paper introduces a neural approach for predicting spatial arrangements from implicit spatial language and shows that models can learn and generalize common sense spatial knowledge from annotated data.
Contribution
It extends spatial templates from explicit spatial prepositions to implicit relationships and shows that neural models can predict spatial configurations, both for unseen object-relation combinations and, by using word embeddings, for unseen objects.
Findings
Models predict spatial locations accurately from implicit language.
Performance remains strong on unseen object-relation combinations.
Word embeddings enable spatial prediction for unseen objects.
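The idea behind the last finding can be illustrated with a minimal sketch. It is not the authors' architecture: the embedding values, layer sizes, and the names `EMB`, `encode`, and `predict_offset` are all hypothetical, and the weights are untrained. It only shows the interface of a model that maps a (subject, relation, object) triple to a 2D position through word embeddings, and why an unseen object whose embedding lies near a seen one receives a similar prediction.

```python
import numpy as np

rng = np.random.default_rng(0)
DIM = 8  # hypothetical embedding size

# Toy stand-in for pre-trained word embeddings (random, hypothetical values).
EMB = {w: rng.normal(size=DIM)
       for w in ["man", "horse", "riding", "glass", "table", "on"]}
# An "unseen" object whose embedding lies close to a seen one.
EMB["pony"] = EMB["horse"] + 0.01 * rng.normal(size=DIM)

# Untrained weights of a small feed-forward regressor (assumed sizes).
W1 = rng.normal(scale=0.1, size=(3 * DIM, 16))
b1 = np.zeros(16)
W2 = rng.normal(scale=0.1, size=(16, 2))
b2 = np.zeros(2)


def encode(subject, relation, obj):
    """Concatenate the three word embeddings into one input vector."""
    return np.concatenate([EMB[subject], EMB[relation], EMB[obj]])


def predict_offset(subject, relation, obj):
    """Return a 2D (x, y) position of the object relative to the subject."""
    hidden = np.tanh(encode(subject, relation, obj) @ W1 + b1)
    return hidden @ W2 + b2


seen = predict_offset("man", "riding", "horse")
unseen = predict_offset("man", "riding", "pony")
```

Because the network is a smooth function of its input embeddings, `seen` and `unseen` end up close together. In a trained model, the same property is what would let spatial knowledge transfer to objects absent from the training data.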
Abstract
Spatial understanding is a fundamental problem with wide-reaching real-world applications. The representation of spatial knowledge is often modeled with spatial templates, i.e., regions of acceptability of two objects under an explicit spatial relationship (e.g., "on", "below", etc.). In contrast with prior work that restricts spatial templates to explicit spatial prepositions (e.g., "glass on table"), here we extend this concept to implicit spatial language, i.e., those relationships (generally actions) for which the spatial arrangement of the objects is only implied (e.g., "man riding horse"). In contrast with explicit relationships, predicting spatial arrangements from implicit spatial language requires significant common sense spatial understanding. Here, we introduce the task of predicting spatial templates for two objects under a relationship, which can be seen as a…
