Open-Vocabulary Affordance Detection using Knowledge Distillation and Text-Point Correlation

Tuan Van Vo; Minh Nhat Vu; Baoru Huang; Toan Nguyen; Ngan Le; Thieu Vo; Anh Nguyen

arXiv:2309.10932·cs.RO·September 21, 2023

TL;DR

This paper presents an open-vocabulary affordance detection method for 3D point clouds that uses knowledge distillation and text-point correlation, improving semantic understanding and enabling real-time robotic applications.

Contribution

The paper combines knowledge distillation from pre-trained 3D models with a text-point correlation method for open-vocabulary affordance detection in 3D point clouds, outperforming previous methods.

Findings

01 Achieves a 7.96% higher mIoU score than the baselines.

02 Outperforms previous methods and adapts to new affordance labels and unseen objects.

03 Supports real-time inference for robotic manipulation.
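
For readers unfamiliar with the metric in the first finding, the sketch below shows one common way to compute mean intersection-over-union (mIoU) for per-point labels. It is an illustration only: the function name `mean_iou`, the toy arrays, and the choice to skip classes absent from both prediction and ground truth are assumptions, and the paper's exact evaluation protocol may differ.

```python
import numpy as np

def mean_iou(pred, target, num_classes):
    """Average per-class IoU over classes present in pred or target.

    Hypothetical helper for illustration; not taken from the paper's code.
    """
    ious = []
    for c in range(num_classes):
        pred_c = pred == c
        target_c = target == c
        union = np.logical_or(pred_c, target_c).sum()
        if union == 0:
            continue  # class absent from both arrays: skip it (an assumption)
        inter = np.logical_and(pred_c, target_c).sum()
        ious.append(inter / union)
    return float(np.mean(ious)) if ious else 0.0

# Toy example: six points, three affordance classes.
pred = np.array([0, 0, 1, 1, 2, 2])
target = np.array([0, 1, 1, 1, 2, 0])
score = mean_iou(pred, target, num_classes=3)
print(score)
```

In this toy case the per-class IoUs are 1/3, 2/3, and 1/2, so the mean is 0.5.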

Abstract

Affordance detection presents intricate challenges and has a wide range of robotic applications. Previous works have faced limitations such as the complexities of 3D object shapes, the wide range of potential affordances on real-world objects, and the lack of open-vocabulary support for affordance understanding. In this paper, we introduce a new open-vocabulary affordance detection method in 3D point clouds, leveraging knowledge distillation and text-point correlation. Our approach employs pre-trained 3D models through knowledge distillation to enhance feature extraction and semantic understanding in 3D point clouds. We further introduce a new text-point correlation method to learn the semantic links between point cloud features and open-vocabulary labels. The intensive experiments show that our approach outperforms previous works and adapts to new affordance labels and unseen objects.…
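
The two ideas named in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the abstract does not specify the encoders, the similarity measure, or the loss, so the cosine similarity, the `temperature` value, the mean-squared-error distillation term, and all function names here are hypothetical. Random arrays stand in for real point-cloud and text features.

```python
import numpy as np

def l2_normalize(x, axis=-1, eps=1e-8):
    """Scale vectors to unit length along the given axis."""
    return x / (np.linalg.norm(x, axis=axis, keepdims=True) + eps)

def text_point_correlation(point_feats, text_feats, temperature=0.07):
    """Correlate N point features with K label embeddings.

    point_feats: (N, D) per-point features from a 3D encoder (assumed).
    text_feats:  (K, D) embeddings of the affordance labels (assumed).
    Returns an (N, K) array of per-point label probabilities.
    """
    p = l2_normalize(point_feats)
    t = l2_normalize(text_feats)
    logits = p @ t.T / temperature               # scaled cosine similarity
    logits -= logits.max(axis=1, keepdims=True)  # stabilize the softmax
    probs = np.exp(logits)
    return probs / probs.sum(axis=1, keepdims=True)

def distillation_loss(student_feats, teacher_feats):
    """MSE between student features and frozen teacher features (assumed form)."""
    return float(np.mean((student_feats - teacher_feats) ** 2))

rng = np.random.default_rng(0)
point_feats = rng.normal(size=(2048, 512))    # stand-in student point features
teacher_feats = rng.normal(size=(2048, 512))  # stand-in pre-trained 3D model features
text_feats = rng.normal(size=(5, 512))        # stand-in embeddings for 5 labels

probs = text_point_correlation(point_feats, text_feats)
labels = probs.argmax(axis=1)                 # predicted label index per point
kd = distillation_loss(point_feats, teacher_feats)
```

Because the label set enters only through `text_feats`, swapping in embeddings for new affordance names changes the output classes without changing the point encoder, which is the general intuition behind open-vocabulary prediction.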

Code & Models

Repositories

Fsoft-AIC/Open-Vocabulary-Affordance-Detection-using-Knowledge-Distillation-and-Text-Point-Correlation
PyTorch · Official

Taxonomy

Topics: Robot Manipulation and Learning · Human Pose and Action Recognition · Image and Object Detection Techniques