Spatial Reasoning via Deep Vision Models for Robotic Sequential Manipulation

Hongyou Zhou; Ingmar Schubert; Marc Toussaint; Ozgur S. Oguz

arXiv:2306.17053 · cs.RO · August 2, 2023

TL;DR

This paper introduces a deep learning-based heuristic for robotic manipulation that predicts which objects in a scene are relevant to a task, shrinking the search space in task and motion planning (TAMP) and making planning more efficient.

Contribution

The paper integrates vision transformer and ResNet models as heuristics within TAMP so that long-horizon tasks can be solved more efficiently.

Findings

01. More efficient solution search compared to state-of-the-art TAMP.

02. Effective prediction of relevant objects for manipulation tasks.

03. Reduced computational complexity in planning.

Abstract

In this paper, we propose using deep neural architectures (i.e., vision transformers and ResNet) as heuristics for sequential decision-making in robotic manipulation problems. This formulation enables predicting the subset of objects that are relevant for completing a task. Such problems are often addressed by task and motion planning (TAMP) formulations that combine symbolic reasoning and continuous motion planning. In essence, the action-object relationships are resolved as discrete, symbolic decisions, which are then used to solve for manipulation motions (e.g., via nonlinear trajectory optimization). However, solving long-horizon tasks requires considering all possible action-object combinations, which limits the scalability of TAMP approaches. To overcome this combinatorial complexity, we introduce a visual perception module integrated with a TAMP solver. Given a task and an initial image…
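
The combinatorial argument in the abstract can be illustrated with a small sketch. It is not the authors' implementation: the relevance scores below are hard-coded, hypothetical stand-ins for what a vision model might output, and brute-force enumeration of action-object sequences stands in for the symbolic search of a TAMP solver. The object names, the two actions, the 0.5 threshold and the helper functions are all assumptions made for illustration.

```python
from itertools import product


def candidate_skeletons(actions, objects, horizon):
    """Enumerate every length-`horizon` sequence of (action, object) pairs."""
    steps = [(a, o) for a in actions for o in objects]
    return product(steps, repeat=horizon)


def select_relevant(scores, threshold=0.5):
    """Keep the objects whose predicted relevance score meets the threshold."""
    return [name for name, score in scores.items() if score >= threshold]


# Hypothetical per-object relevance scores (a trained network would supply these).
scores = {
    "red_block": 0.93,
    "blue_block": 0.08,
    "green_block": 0.81,
    "tray": 0.12,
    "stick": 0.05,
}

actions = ["pick", "place"]  # assumed single-object actions
horizon = 3                  # assumed plan length

all_objects = list(scores)
relevant = select_relevant(scores)

# Count candidate sequences with and without pruning.
full = sum(1 for _ in candidate_skeletons(actions, all_objects, horizon))
pruned = sum(1 for _ in candidate_skeletons(actions, relevant, horizon))

print(f"all objects:      {len(all_objects)} objects, {full} candidate sequences")
print(f"relevant objects: {len(relevant)} objects, {pruned} candidate sequences")
```

With two actions, the number of candidates is (2 × number of objects) raised to the horizon, so restricting the search from five objects to two cuts the count from 1000 to 64 in this toy setting. The gap widens as the horizon or the number of distractor objects grows.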

Taxonomy

Topics: Robotic Path Planning Algorithms · Multimodal Machine Learning Applications · Robot Manipulation and Learning