Interactive Robotic Grasping with Attribute-Guided Disambiguation
Yang Yang, Xibai Lou, Changhyun Choi

TL;DR
This paper presents an interactive robotic grasping system that uses attribute-guided disambiguation to resolve ambiguities in natural language commands, achieving high accuracy in real-world experiments.
Contribution
It introduces an attribute-guided POMDP framework for real-time disambiguation in robotic grasping using natural language and visual attributes.
Findings
Achieves 91.43% selection accuracy in real robot experiments.
Outperforms baseline methods significantly.
Effectively resolves ambiguities via dialogue in grasping tasks.
Abstract
Interactive robotic grasping using natural language is one of the most fundamental tasks in human-robot interaction. However, language can be a source of ambiguity, particularly when the visual or linguistic content is itself ambiguous. This paper investigates the use of object attributes in disambiguation and develops an interactive grasping system capable of effectively resolving ambiguities through dialogue. Our approach first predicts target scores and attribute scores through vision-and-language grounding. To handle ambiguous objects and commands, we propose an attribute-guided formulation of the partially observable Markov decision process (Attr-POMDP) for disambiguation. The Attr-POMDP uses target and attribute scores as the observation model to calculate the expected return of an attribute-based (e.g., "what is the color of the target, red or green?") or a pointing-based (e.g.,…
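To make the role of target and attribute scores concrete, the sketch below shows a single Bayesian belief update after the user answers an attribute question. It is a simplified stand-in for illustration, not the paper's Attr-POMDP: the function name `update_belief`, the three-object scene, and all score values are hypothetical, and the expected-return computation that chooses which question to ask is not modeled.

```python
def update_belief(belief, likelihood):
    """Bayes rule per candidate object: posterior is proportional to prior * likelihood."""
    unnormalized = [b * l for b, l in zip(belief, likelihood)]
    total = sum(unnormalized)
    if total == 0:
        # The answer contradicts every candidate; fall back to the prior.
        return list(belief)
    return [u / total for u in unnormalized]


# Hypothetical scene with three candidate objects.
# target_scores: assumed grounding output, used here as the prior belief
# that each object is the one the user asked for.
target_scores = [0.4, 0.4, 0.2]

# red_scores: assumed attribute scores, read here as the probability that
# the user would answer "red" if that object were the target.
red_scores = [0.9, 0.1, 0.5]

# The robot asks about color and the user answers "red".
posterior = update_belief(target_scores, red_scores)

# Index of the most likely target after the answer.
best = max(range(len(posterior)), key=posterior.__getitem__)
```

Under these assumed numbers, the first two objects start out equally likely, and the answer "red" shifts most of the belief to the first one, which is the kind of ambiguity an attribute question is meant to resolve.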
