Which objects help me to act effectively? Reasoning about physically-grounded affordances

Anne Kemmeren; Gertjan Burghouts; Michael van Bekkum; Wouter Meijer; Jelle van Mil

arXiv:2407.13811·cs.CV·July 22, 2024

TL;DR

This paper presents a method combining large language models and vision-language models to detect affordances of objects in open-world environments, considering physical properties and robot embodiment for effective interaction.

Contribution

It introduces a novel approach that grounds affordance detection in physical properties and robot embodiment using LLMs and VLMs, enabling open-vocabulary and context-aware object interaction.

Findings

01. Method successfully identifies useful objects among distractors.

02. Finetuning VLMs enhances physical property understanding.

03. Grounding in physical properties improves affordance detection accuracy.

Abstract

For effective interactions with the open world, robots should understand how interactions with known and novel objects help them towards their goal. A key aspect of this understanding lies in detecting an object's affordances, which represent the potential effects that can be achieved by manipulating the object in various ways. Our approach leverages a dialogue of large language models (LLMs) and vision-language models (VLMs) to achieve open-world affordance detection. Given open-vocabulary descriptions of intended actions and effects, the useful objects in the environment are found. By grounding our system in the physical world, we account for the robot's embodiment and the intrinsic properties of the objects it encounters. In our experiments, we have shown that our method produces tailored outputs based on different embodiments or intended effects. The method was able to select a…
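The LLM and VLM dialogue described in the abstract can be pictured with a minimal sketch. This is not the authors' implementation: the names `rank_useful_objects`, `toy_propose`, `toy_score`, and `TOY_SCORES` are hypothetical, the two callables are stand-ins for an LLM (which proposes the physical properties an object needs for a given effect and embodiment) and a VLM (which scores how strongly an observed object shows a property), and combining scores with `min` plus a fixed threshold is an assumption made only for illustration.

```python
from dataclasses import dataclass
from typing import Callable, List

# Hypothetical stand-ins: a real system would wrap an LLM and a VLM here.
PropertyProposer = Callable[[str, str], List[str]]  # (effect, embodiment) -> required properties
PropertyScorer = Callable[[str, str], float]        # (object, property) -> score in [0, 1]


@dataclass
class Candidate:
    name: str
    score: float


def rank_useful_objects(effect: str, embodiment: str, objects: List[str],
                        propose: PropertyProposer, score: PropertyScorer,
                        threshold: float = 0.5) -> List[Candidate]:
    """Keep objects whose weakest required property still meets the threshold."""
    required = propose(effect, embodiment)
    ranked = []
    for obj in objects:
        # Assumption: an object is only as useful as its weakest required property.
        weakest = min((score(obj, prop) for prop in required), default=0.0)
        if weakest >= threshold:
            ranked.append(Candidate(obj, weakest))
    return sorted(ranked, key=lambda c: c.score, reverse=True)


def toy_propose(effect: str, embodiment: str) -> List[str]:
    # Toy rule in place of an LLM: a gripper adds a graspability requirement.
    props = ["rigid"]
    if "gripper" in embodiment:
        props.append("graspable")
    return props


# Made-up scores in place of VLM outputs.
TOY_SCORES = {
    ("hammer", "rigid"): 0.9, ("hammer", "graspable"): 0.8,
    ("pillow", "rigid"): 0.1, ("pillow", "graspable"): 0.9,
    ("anvil", "rigid"): 1.0, ("anvil", "graspable"): 0.2,
}


def toy_score(obj: str, prop: str) -> float:
    return TOY_SCORES.get((obj, prop), 0.0)


scene = ["hammer", "pillow", "anvil"]
with_gripper = rank_useful_objects("drive a nail into wood", "arm with small gripper",
                                   scene, toy_propose, toy_score)
push_only = rank_useful_objects("drive a nail into wood", "mobile base that can only push",
                                scene, toy_propose, toy_score)
print([c.name for c in with_gripper])
print([c.name for c in push_only])
```

In this toy scene the same intended effect yields different selections for the two embodiments, which mirrors the abstract's claim of outputs tailored to embodiment; the scores and rules are invented and carry no experimental meaning.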

Taxonomy

Topics: Ethics and Social Impacts of AI · Multi-Agent Systems and Negotiation · Epistemology, Ethics, and Metaphysics

Methods: Sparse Evolutionary Training