ReorientDiff: Diffusion Model based Reorientation for Object Manipulation

Utkarsh A. Mishra; Yongxin Chen

arXiv:2303.12700·cs.RO·September 18, 2023·1 cites

Open Access

TL;DR

ReorientDiff is a diffusion model-based approach that plans intermediate object reorientation poses using visual and language inputs, achieving high success in simulation for robotic manipulation tasks.

Contribution

This paper introduces ReorientDiff, a novel diffusion model-based method for object reorientation that integrates visual and language cues for improved planning.

Findings

1. Achieved a 95.2% success rate in simulation with YCB objects.
2. Effectively conditions on scene and goal language prompts.
3. Demonstrates potential for generalizable object manipulation.

Abstract

The ability to manipulate objects into desired configurations is a fundamental requirement for robots to complete various practical applications. While certain goals can be achieved by picking and placing the objects of interest directly, object reorientation is needed for precise placement in most tasks. In such scenarios, the object must be reoriented and re-positioned into intermediate poses that facilitate accurate placement at the target pose. To this end, we propose a reorientation planning method, ReorientDiff, that utilizes a diffusion model-based approach. The proposed method employs both visual inputs from the scene and goal-specific language prompts to plan intermediate reorientation poses. Specifically, the scene and language-task information are mapped into a joint scene-task representation feature space, which is subsequently leveraged to condition the diffusion…
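
As a rough illustration of what conditioning a diffusion sampler on a joint feature can look like, the sketch below runs standard DDPM ancestral sampling over a small pose vector. It is not the paper's implementation. Everything named here is an assumption made for the example: `joint_embedding` is plain concatenation standing in for the learned scene-task representation, `make_toy_predictor` is a hand-written oracle standing in for a trained noise-prediction network, and the 3-dimensional pose, the linear schedule, and the 50 steps are arbitrary toy choices.

```python
import math
import random


def make_schedule(num_steps, beta_start=1e-4, beta_end=0.02):
    """Return (betas, alpha_bars) for a linear DDPM noise schedule."""
    betas = [beta_start + (beta_end - beta_start) * i / max(num_steps - 1, 1)
             for i in range(num_steps)]
    alpha_bars, running = [], 1.0
    for beta in betas:
        running *= 1.0 - beta          # cumulative product of (1 - beta)
        alpha_bars.append(running)
    return betas, alpha_bars


def joint_embedding(scene_feat, task_feat):
    """HYPOTHETICAL joint scene-task feature: plain concatenation."""
    return list(scene_feat) + list(task_feat)


def make_toy_predictor(alpha_bars, pose_dim):
    """HYPOTHETICAL stand-in for a trained network eps(x_t, t, cond)."""
    def predict_noise(x_t, t, cond):
        # Toy assumption: the first pose_dim entries of cond are the clean pose,
        # so the noise that explains x_t can be computed in closed form.
        target = cond[:pose_dim]
        a = alpha_bars[t]
        return [(x - math.sqrt(a) * g) / math.sqrt(1.0 - a)
                for x, g in zip(x_t, target)]
    return predict_noise


def sample_pose(predict_noise, cond, pose_dim, betas, alpha_bars, rng):
    """DDPM ancestral sampling of one pose vector, conditioned on cond."""
    x = [rng.gauss(0.0, 1.0) for _ in range(pose_dim)]   # start from pure noise
    for t in reversed(range(len(betas))):
        eps = predict_noise(x, t, cond)
        coef = betas[t] / math.sqrt(1.0 - alpha_bars[t])
        mean = [(xi - coef * ei) / math.sqrt(1.0 - betas[t])
                for xi, ei in zip(x, eps)]
        sigma = math.sqrt(betas[t]) if t > 0 else 0.0    # no noise on the last step
        x = [m + sigma * rng.gauss(0.0, 1.0) for m in mean]
    return x


# Toy usage: the "scene" feature doubles as the pose the oracle steers toward.
betas, alpha_bars = make_schedule(50)
cond = joint_embedding([0.3, -0.1, 0.2], [1.0, 0.0])
predictor = make_toy_predictor(alpha_bars, pose_dim=3)
pose = sample_pose(predictor, cond, 3, betas, alpha_bars, random.Random(0))
```

Because the oracle predictor always explains the current sample relative to the same clean pose, the final noise-free step lands on that pose. A learned predictor would instead infer plausible intermediate poses from the conditioning feature, which is the part this sketch deliberately leaves out.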

Taxonomy

Topics: Robot Manipulation and Learning · Human Pose and Action Recognition · Multimodal Machine Learning Applications

Methods: Diffusion