Semantically Grounded Object Matching for Robust Robotic Scene Rearrangement

Walter Goodwin; Sagar Vaze; Ioannis Havoutis; Ingmar Posner

arXiv:2111.07975·cs.RO·November 16, 2021

TL;DR

This paper introduces a novel object matching method using a large pre-trained vision-language model to improve robustness in robotic scene rearrangement, especially when source and goal images differ in object instances.

Contribution

The work presents a new cross-instance object matching approach leveraging semantics and visual features, overcoming limitations of previous methods that required identical object instances.

Findings

01 Significantly improved matching performance in cross-instance scenarios

02 Enables robot manipulation from goal images with no shared object instances

03 Demonstrates robustness to increased visual scene shifts

Abstract

Object rearrangement has recently emerged as a key competency in robot manipulation, with practical solutions generally involving object detection, recognition, grasping and high-level planning. Goal-images describing a desired scene configuration are a promising and increasingly used mode of instruction. A key outstanding challenge is the accurate inference of matches between objects in front of a robot, and those seen in a provided goal image, where recent works have struggled in the absence of object-specific training data. In this work, we explore the deterioration of existing methods' ability to infer matches between objects as the visual shift between observed and goal scenes increases. We find that a fundamental limitation of the current setting is that source and target images must contain the same instance of every object, which restricts practical deployment. We…
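To make the matching problem in the abstract concrete, the following is a minimal sketch and not the authors' implementation. It assumes each detected object has already been reduced to a feature vector (for instance by some pre-trained vision-language model; the encoder is not specified on this page) and that both scenes contain the same number of objects. It then picks the one-to-one assignment between source and goal objects that maximises total cosine similarity. The function names `cosine` and `match_objects`, the brute-force search, and the toy vectors are all hypothetical choices made for illustration.

```python
from itertools import permutations
from math import sqrt


def cosine(u, v):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = sqrt(sum(a * a for a in u))
    norm_v = sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def match_objects(source_feats, goal_feats):
    """Return (source_index, goal_index) pairs for the best one-to-one matching.

    Assumption: both scenes hold the same number of objects.
    Brute force over all permutations, so only suitable for a handful of objects.
    """
    if len(source_feats) != len(goal_feats):
        raise ValueError("this sketch assumes equal object counts")
    n = len(source_feats)
    # Pairwise similarity between every source object and every goal object.
    sim = [[cosine(s, g) for g in goal_feats] for s in source_feats]
    # Choose the permutation of goal indices with the highest total similarity.
    best = max(
        permutations(range(n)),
        key=lambda perm: sum(sim[i][perm[i]] for i in range(n)),
    )
    return list(enumerate(best))


# Toy, hand-made feature vectors (hypothetical; not data from the paper).
source = [[1.0, 0.1, 0.0], [0.0, 1.0, 0.2], [0.1, 0.0, 1.0]]
goal = [[0.0, 0.2, 0.9], [0.9, 0.0, 0.1], [0.1, 0.8, 0.0]]
matches = match_objects(source, goal)
```

In this toy case each source object pairs with the goal object whose vector points in nearly the same direction. With more than a few objects, a polynomial-time assignment solver would be a more practical replacement for the permutation search.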

Code & Models

Repositories

applied-ai-lab/object_matching (official)

Taxonomy

Topics: Multimodal Machine Learning Applications · Advanced Image and Video Retrieval Techniques · Domain Adaptation and Few-Shot Learning