Learning Embeddings that Capture Spatial Semantics for Indoor Navigation
Vidhi Jain, Prakhar Agarwal, Shishir Patil, Katia Sycara

TL;DR
This paper introduces a method for creating object embeddings that encode spatial semantics, improving indoor navigation by leveraging pre-trained language models and knowledge bases to guide search tasks in unseen environments.
Contribution
The work presents a novel approach to embed spatial semantic priors into object representations for indoor navigation, enhancing search efficiency in unseen environments.
Findings
The object embeddings improve the search success rate in indoor environments.
Pre-trained language models effectively encode spatial semantics.
The method outperforms baseline approaches in the AI2Thor simulator.
Abstract
Incorporating domain-specific priors in search and navigation tasks has shown promising results in improving generalization and sample complexity over end-to-end trained policies. In this work, we study how object embeddings that capture spatial semantic priors can guide search and navigation tasks in a structured environment. Humans can search for an object such as a book or a plate in an unseen house by relying on the spatial semantics of the larger objects they detect. For example, a book is likely to be on a bookshelf or a table, whereas a plate is likely to be in a cupboard or dishwasher. We propose a method to incorporate such spatial semantic awareness in robots by leveraging pre-trained language models and multi-relational knowledge bases as object embeddings. We demonstrate the use of these object embeddings to search for a query object in an unseen indoor environment. We measure the…
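The idea of letting larger detected objects guide the search can be illustrated with a minimal sketch. It ranks detected landmark objects by the cosine similarity between their embedding and the query object's embedding. Everything here is an assumption made for illustration: the three-dimensional vectors are hand-made toy values rather than embeddings from a language model or knowledge base, the helper names `cosine_similarity` and `rank_landmarks` are hypothetical, and cosine similarity is simply one common scoring choice, not necessarily the one the paper uses.

```python
import math

# Hypothetical toy embeddings, hand-picked for illustration only.
# Real embeddings would come from a pre-trained language model or a
# knowledge-base embedding, as the abstract describes.
TOY_EMBEDDINGS = {
    "book": [0.9, 0.1, 0.0],
    "plate": [0.1, 0.9, 0.1],
    "bookshelf": [0.8, 0.2, 0.1],
    "table": [0.5, 0.5, 0.2],
    "cupboard": [0.2, 0.8, 0.2],
    "dishwasher": [0.0, 0.9, 0.3],
}


def cosine_similarity(u, v):
    """Return the cosine of the angle between vectors u and v."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # treat a zero vector as unrelated to everything
    return dot / (norm_u * norm_v)


def rank_landmarks(query, detected, embeddings):
    """Sort detected landmark objects by similarity to the query object.

    Returns a list of (landmark_name, score) pairs, best match first.
    """
    query_vec = embeddings[query]
    scored = [
        (name, cosine_similarity(query_vec, embeddings[name]))
        for name in detected
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    detected = ["dishwasher", "table", "bookshelf"]
    for name, score in rank_landmarks("book", detected, TOY_EMBEDDINGS):
        print(f"{name}: {score:.3f}")
```

With these toy vectors, a search for a book would visit the bookshelf before the table or the dishwasher. An agent could use such a ranking to decide which detected landmark to approach first in an unseen room.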
Taxonomy
Topics: Multimodal Machine Learning Applications · Domain Adaptation and Few-Shot Learning · Topic Modeling
