GauTOAO: Gaussian-based Task-Oriented Affordance of Objects

Jiawen Wang; Dingsheng Luo

arXiv:2409.11941 · cs.RO · September 19, 2024

TL;DR

GauTOAO is a Gaussian-based framework that enables robots to understand task-specific object affordances in real time using vision-language models, improving manipulation accuracy.

Contribution

The paper introduces a novel zero-shot method combining vision-language models and Gaussian distributions for precise task-oriented object affordance detection.

Findings

01. Enhanced accuracy in affordance region prediction
02. Effective generalization across multiple tasks
03. Improved robot manipulation performance

Abstract

When your robot grasps an object using dexterous hands or grippers, it should understand the Task-Oriented Affordances of the Object (TOAO), as different tasks often require attention to specific parts of the object. To address this challenge, we propose GauTOAO, a Gaussian-based framework for Task-Oriented Affordance of Objects, which leverages vision-language models in a zero-shot manner to predict affordance-relevant regions of an object, given a natural language query. Our approach introduces a new paradigm: "static camera, moving object," allowing the robot to better observe and understand the object in hand during manipulation. GauTOAO addresses the limitations of existing methods, which often lack effective spatial grouping, by extracting a comprehensive 3D object mask using DINO features. This mask is then used to conditionally query Gaussians, producing a refined semantic…
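The mask-conditioned query described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: it assumes each Gaussian carries a language-aligned feature vector, that the 3D object mask is available as one boolean per Gaussian, and that the text query has already been embedded in the same feature space. The function name, the parameters, and the softmax temperature are all hypothetical choices made for illustration.

```python
import numpy as np


def masked_affordance_distribution(gaussian_feats, object_mask, query_embedding,
                                   temperature=0.1):
    """Return a probability distribution over Gaussians for a text query.

    Hypothetical sketch: Gaussians outside the object mask get probability 0;
    those inside are weighted by a softmax over cosine similarity to the query.
    """
    if not object_mask.any():
        raise ValueError("object_mask selects no Gaussians")

    eps = 1e-12  # guards against division by zero for all-zero feature vectors
    feats = gaussian_feats / (np.linalg.norm(gaussian_feats, axis=1, keepdims=True) + eps)
    query = query_embedding / (np.linalg.norm(query_embedding) + eps)

    # Cosine similarity between every Gaussian feature and the query.
    similarity = feats @ query

    # Condition on the mask: Gaussians off the object are excluded entirely.
    logits = np.where(object_mask, similarity / temperature, -np.inf)
    logits = logits - logits[object_mask].max()  # numerical stability

    weights = np.exp(logits)  # exp(-inf) evaluates to 0 for masked-out entries
    return weights / weights.sum()


# Toy data (made up): 6 Gaussians with 4-dimensional features.
rng = np.random.default_rng(0)
gaussian_feats = rng.normal(size=(6, 4))
object_mask = np.array([False, True, True, True, False, True])
query_embedding = gaussian_feats[2].copy()  # query matches Gaussian 2 exactly

distribution = masked_affordance_distribution(gaussian_feats, object_mask, query_embedding)
```

In this toy setup the distribution sums to one, is zero outside the mask, and peaks at the Gaussian whose feature matches the query. A lower temperature would concentrate the mass further on the best-matching Gaussians.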

Taxonomy

Topics: Reinforcement Learning in Robotics · Simulation Techniques and Applications · Computer Graphics and Visualization Techniques