Open-Vocabulary Part-Based Grasping

Tjeard van Oort; Dimity Miller; Will N. Browne; Nicolas Marticorena; Jesse Haviland; Niko Suenderhauf

arXiv:2406.05951·cs.RO·September 19, 2025

TL;DR

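This paper introduces AnyPart, a modular framework that combines open-vocabulary object detection, part segmentation, and 6-DoF grasp prediction so that robots can grasp user-specified object parts from natural language prompts. The best of 16 evaluated model combinations reaches 60.8% grasp success in cluttered real-world scenes, with inference 60 times faster than existing approaches.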

Contribution

The paper presents a novel modular approach that unifies detection, segmentation, and grasping for open-vocabulary part-based grasping without extra training.

Findings

01 Achieves 60.8% grasp success in cluttered real-world scenes (best of 16 model combinations)

02 Runs inference 60 times faster than existing approaches

03 Introduces a new dataset for part-based grasping

Abstract

Many robotic tasks require grasping objects at specific object parts instead of arbitrarily, a crucial capability for interactions beyond simple pick-and-place, such as human-robot interaction, handovers, or tool use. Prior work has focused either on generic grasp prediction or task-conditioned grasping, but not on directly targeting object parts in an open-vocabulary way. We propose AnyPart, a modular framework that unifies open-vocabulary object detection, part segmentation, and 6-DoF grasp prediction to enable robots to grasp user-specified parts of arbitrary objects based on natural language prompts. We evaluate 16 model combinations, and demonstrate that the best-performing combination achieves 60.8% grasp success in cluttered real-world scenes at 60 times faster inference than existing approaches. To support this study, we introduce a new dataset for part-based grasping and…
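The three stages named in the abstract (open-vocabulary object detection, part segmentation, 6-DoF grasp prediction) can be pictured as a chain of interchangeable callables. The following is a minimal sketch of that idea only, not the authors' implementation: the names `Grasp`, `grasp_part`, `toy_detect`, `toy_segment`, and `toy_predict` are hypothetical, the toy scene is a plain dictionary, and the choice to keep only grasp candidates whose centre falls inside the part mask is an assumption made for illustration.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Set, Tuple

Pixel = Tuple[int, int]          # (row, col) image coordinate
Box = Tuple[int, int, int, int]  # (row_min, col_min, row_max, col_max), inclusive


@dataclass(frozen=True)
class Grasp:
    pixel: Pixel             # image location of the grasp centre
    pose: Tuple[float, ...]  # 6-DoF pose, here (x, y, z, roll, pitch, yaw)
    score: float             # predictor confidence


# Hypothetical stage signatures; real models would sit behind these.
Detector = Callable[[dict, str], Optional[Box]]
PartSegmenter = Callable[[dict, Box, str], Set[Pixel]]
GraspPredictor = Callable[[dict], List[Grasp]]


def grasp_part(scene: dict, object_prompt: str, part_prompt: str,
               detect: Detector, segment: PartSegmenter,
               predict: GraspPredictor) -> Optional[Grasp]:
    """Chain detection, part segmentation, and grasp prediction."""
    box = detect(scene, object_prompt)
    if box is None:
        return None  # object not found
    mask = segment(scene, box, part_prompt)
    if not mask:
        return None  # part not found on the object
    # Assumption: keep only candidates centred on the requested part.
    on_part = [g for g in predict(scene) if g.pixel in mask]
    return max(on_part, key=lambda g: g.score, default=None)


# Toy stand-ins so the sketch runs without any model weights.
def toy_detect(scene: dict, prompt: str) -> Optional[Box]:
    return scene["objects"].get(prompt)


def toy_segment(scene: dict, box: Box, part_prompt: str) -> Set[Pixel]:
    r0, c0, r1, c1 = box
    pixels = scene["parts"].get(part_prompt, set())
    return {(r, c) for (r, c) in pixels if r0 <= r <= r1 and c0 <= c <= c1}


def toy_predict(scene: dict) -> List[Grasp]:
    return scene["grasps"]


scene = {
    "objects": {"mug": (0, 0, 4, 4)},
    "parts": {"handle": {(2, 4), (3, 4)}},
    "grasps": [
        Grasp((1, 1), (0.10, 0.10, 0.20, 0.0, 0.0, 0.0), 0.9),  # on the body
        Grasp((2, 4), (0.10, 0.18, 0.20, 0.0, 0.0, 1.57), 0.7),  # on the handle
        Grasp((3, 4), (0.12, 0.18, 0.20, 0.0, 0.0, 1.57), 0.5),  # on the handle
    ],
}

best = grasp_part(scene, "mug", "handle", toy_detect, toy_segment, toy_predict)
missing = grasp_part(scene, "spoon", "handle", toy_detect, toy_segment, toy_predict)
```

In this toy scene the highest-scoring candidate lies on the mug body, so it is discarded and the better of the two handle grasps is returned. Because each stage is only a callable, replacing the model behind one stage leaves the other two untouched, which is the property that makes comparing many model combinations practical.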

Taxonomy

Topics: Natural Language Processing Techniques