Zero-Shot Object Searching Using Large-scale Object Relationship Prior
Hongyi Chen, Ruinian Xu, Shuo Cheng, Patricio A. Vela, Danfei Xu

TL;DR
This paper introduces a zero-shot object search method for home-assistant robots that leverages object relationships learned from large datasets, improving navigation efficiency without environment-specific training.
Contribution
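The paper presents a zero-shot object search approach in which a graph neural network trained on Visual Genome supplies object co-occurrence knowledge, which is combined with the robot's existing semantic knowledge of the home (such as the location of static furniture) to guide navigation efficiently.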
Findings
Outperforms prior correlational object search algorithms
Effective in both simulation and real-world environments
Integrates natural language understanding with object navigation
Abstract
Home-assistant robots have been a long-standing research topic, and one of the biggest challenges is searching for required objects in housing environments. Previous object-goal navigation requires the robot to search for a target object category in an unexplored environment, which may not be suitable for home-assistant robots that typically have some level of semantic knowledge of the environment, such as the location of static furniture. In our approach, we leverage this knowledge and the fact that a target object may be located close to its related objects for efficient navigation. To achieve this, we train a graph neural network using the Visual Genome dataset to learn the object co-occurrence relationships and formulate the searching process as iteratively predicting the possible areas where the target object may be located. This approach is entirely zero-shot, meaning it doesn't…
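The idea of searching near related objects can be illustrated with a small sketch. This is not the authors' graph neural network: it swaps in a plain count-based co-occurrence prior, and every name in it (`cooccurrence_counts`, `relatedness`, `rank_landmarks`, `search`, and the toy scene lists) is hypothetical. It assumes scenes are given as lists of object labels, as could be derived from an annotated dataset such as Visual Genome, and that the robot already knows which static landmarks exist.

```python
from collections import Counter
from itertools import combinations


def cooccurrence_counts(scenes):
    """Count per-object and per-pair scene occurrences (assumed input: lists of labels)."""
    pair_counts = Counter()
    object_counts = Counter()
    for labels in scenes:
        unique = sorted(set(labels))  # ignore duplicates within one scene
        object_counts.update(unique)
        pair_counts.update(combinations(unique, 2))  # pairs are stored in sorted order
    return pair_counts, object_counts


def relatedness(target, landmark, pair_counts, object_counts):
    """Estimate P(target in scene | landmark in scene) from raw counts."""
    if object_counts[landmark] == 0:
        return 0.0
    pair = tuple(sorted((target, landmark)))
    return pair_counts[pair] / object_counts[landmark]


def rank_landmarks(target, landmarks, pair_counts, object_counts):
    """Order known landmarks from most to least related to the target."""
    scored = [
        (relatedness(target, lm, pair_counts, object_counts), lm)
        for lm in landmarks
    ]
    # Highest score first; ties broken alphabetically for determinism.
    return [lm for _, lm in sorted(scored, key=lambda s: (-s[0], s[1]))]


def search(target, landmark_contents, pair_counts, object_counts):
    """Visit landmarks in ranked order until the target is observed.

    landmark_contents is a hypothetical stand-in for perception: it maps
    each landmark to the set of objects seen when the robot gets there.
    """
    visited = []
    ranked = rank_landmarks(
        target, list(landmark_contents), pair_counts, object_counts
    )
    for lm in ranked:
        visited.append(lm)
        if target in landmark_contents[lm]:
            return visited, True
    return visited, False
```

Calling `cooccurrence_counts` on a list of labelled scenes and passing the result to `search` yields the visit order and whether the target was found. The method described in the abstract replaces the count-based score with predictions from a learned graph neural network and predicts candidate areas iteratively, so this sketch only conveys the prioritization idea.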
Taxonomy
Topics: Multimodal Machine Learning Applications · Domain Adaptation and Few-Shot Learning · Advanced Image and Video Retrieval Techniques
