Language-Conditioned Affordance-Pose Detection in 3D Point Clouds

Toan Nguyen; Minh Nhat Vu; Baoru Huang; Tuan Van Vo; Vy Truong; Ngan Le; Thieu Vo; Bac Le; Anh Nguyen

arXiv:2309.10911 · cs.RO · September 21, 2023

TL;DR

This paper introduces a novel language-conditioned method for joint affordance detection and pose estimation in 3D point clouds, enabling robots to recognize and manipulate objects with any affordance label in real-world scenarios.

Contribution

It proposes an open-vocabulary affordance detection and pose generation framework using a language-guided diffusion model, along with a new dataset for language-driven affordance-pose learning.

Findings

01. Effective on a wide range of open-vocabulary affordances

02. Outperforms baseline methods significantly

03. Demonstrates practical utility in robotic applications

Abstract

Affordance detection and pose estimation are of great importance in many robotic applications. Their combination helps the robot gain an enhanced manipulation capability, in which the generated pose can facilitate the corresponding affordance task. Previous methods for affordance-pose joint learning are limited to a predefined set of affordances, thus limiting the adaptability of robots in real-world environments. In this paper, we propose a new method for language-conditioned affordance-pose joint learning in 3D point clouds. Given a 3D point cloud object, our method detects the affordance region and generates appropriate 6-DoF poses for any unconstrained affordance label. Our method consists of an open-vocabulary affordance detection branch and a language-guided diffusion model that generates 6-DoF poses based on the affordance text. We also introduce a new high-quality dataset for the…
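As a rough illustration of what "open-vocabulary affordance detection" can mean in practice, the sketch below labels each point with the affordance text whose embedding is most similar to that point's feature vector. It is not the paper's implementation: the function name `label_points`, the `temperature` value, the feature dimension, and the random stand-in embeddings are all assumptions made for this example. A real system would obtain point features from a point cloud network and text embeddings from a language encoder.

```python
import numpy as np

def label_points(point_feats, text_embeds, temperature=0.07):
    """Hypothetical helper: score every point against every affordance text.

    point_feats: (N, D) per-point features (assumed to come from a point encoder).
    text_embeds: (K, D) one embedding per affordance label (assumed text encoder).
    Returns the best label index per point and the (N, K) softmax probabilities.
    """
    # L2-normalise both sides so the dot product is a cosine similarity.
    p = point_feats / np.linalg.norm(point_feats, axis=1, keepdims=True)
    t = text_embeds / np.linalg.norm(text_embeds, axis=1, keepdims=True)
    logits = p @ t.T / temperature
    # Numerically stable softmax over the label axis.
    logits = logits - logits.max(axis=1, keepdims=True)
    probs = np.exp(logits)
    probs = probs / probs.sum(axis=1, keepdims=True)
    return probs.argmax(axis=1), probs

# Toy data: 3 label embeddings and 12 points, 4 near each label embedding.
rng = np.random.default_rng(0)
text_embeds = rng.normal(size=(3, 16))
point_feats = np.repeat(text_embeds, 4, axis=0) + 0.01 * rng.normal(size=(12, 16))

labels, probs = label_points(point_feats, text_embeds)
```

Because labels enter only through their text embeddings, a new affordance label can be scored by adding one more row to `text_embeds`, with no change to the scoring code. The pose-generation branch described in the abstract is not sketched here.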

Taxonomy

Topics: Robot Manipulation and Learning · Robotic Mechanisms and Dynamics · Human Pose and Action Recognition