OVAL-Grasp: Open-Vocabulary Affordance Localization for Task Oriented Grasping

Edmond Tong; Advaith Balaji; Anthony Opipari; Stanley Lewis; Zhen Zeng; Odest Chadwicke Jenkins

arXiv:2511.20841 · cs.RO · November 27, 2025

Open Access

TL;DR

OVAL-Grasp is a zero-shot, open-vocabulary method that uses large language models and vision-language models to identify and segment the relevant parts of an object, enabling robots to perform task-oriented, affordance-based grasping on novel objects.

Contribution

It introduces a modular approach that combines LLMs and VLMs for task-oriented grasping, and it outperforms two task-oriented grasping baselines in experiments on 20 household objects with 3 tasks each.

Findings

01. Achieved 95% accuracy in identifying correct object parts.

02. Successfully grasped correct actionable areas 78.3% of the time in real-world tests.

03. Maintained 80% success rate in cluttered scenes with occlusions.

Abstract

To manipulate objects in novel, unstructured environments, robots need task-oriented grasps that target object parts based on the given task. Geometry-based methods often struggle with visually defined parts, occlusions, and unseen objects. We introduce OVAL-Grasp, a zero-shot, open-vocabulary approach to task-oriented, affordance-based grasping that uses large language models and vision-language models to allow a robot to grasp objects at the correct part according to a given task. Given an RGB image and a task, OVAL-Grasp identifies parts to grasp or avoid with an LLM, segments them with a VLM, and generates a 2D heatmap of actionable regions on the object. In our evaluations, our method outperformed two task-oriented grasping baselines in experiments with 20 household objects and 3 unique tasks for each. OVAL-Grasp successfully identifies and segments the correct…
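The last stage of the pipeline described in the abstract, turning part masks into a 2D heatmap, can be illustrated with a small sketch. The abstract does not say how the heatmap is computed, so everything here is an assumption made for illustration: the hypothetical `combine_masks` and `best_pixel` helpers, the subtract-and-clamp scoring rule, and the `avoid_weight` parameter. The LLM and VLM stages are replaced by hand-written binary masks.

```python
from typing import List, Tuple

Mask = List[List[int]]       # binary mask: 1 = pixel belongs to the part
Heatmap = List[List[float]]  # per-pixel score, 0.0 = do not grasp here


def combine_masks(grasp: Mask, avoid: Mask, avoid_weight: float = 1.0) -> Heatmap:
    """Assumed scoring rule: 1 inside a grasp part, minus a penalty inside an
    avoid part, clamped at 0. Not taken from the paper."""
    heatmap: Heatmap = []
    for grasp_row, avoid_row in zip(grasp, avoid):
        heatmap.append(
            [max(0.0, float(g) - avoid_weight * float(a))
             for g, a in zip(grasp_row, avoid_row)]
        )
    return heatmap


def best_pixel(heatmap: Heatmap) -> Tuple[int, int]:
    """Return (row, col) of the highest score; the first one wins on ties."""
    best = (0, 0)
    for r, row in enumerate(heatmap):
        for c, value in enumerate(row):
            if value > heatmap[best[0]][best[1]]:
                best = (r, c)
    return best


# Stand-ins for VLM output on a 2x3 image, e.g. a "handle" part to grasp
# and a "blade" part to avoid (hypothetical part names).
grasp_mask = [[0, 1, 1],
              [0, 1, 0]]
avoid_mask = [[0, 0, 1],
              [0, 0, 0]]

heatmap = combine_masks(grasp_mask, avoid_mask)
target = best_pixel(heatmap)
```

Under this assumed rule, a pixel that falls in both a grasp part and an avoid part scores zero, so the avoid mask takes precedence wherever segmentations overlap.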

Taxonomy

Topics: Robot Manipulation and Learning · Motor Control and Adaptation · Social Robot Interaction and HRI