VADER: Visual Affordance Detection and Error Recovery for Multi Robot Human Collaboration

Michael Ahn (1), Montserrat Gonzalez Arenas (1), Matthew Bennice (2), Noah Brown (5), Christine Chan (1), Byron David (1), Anthony Francis (4), Gavin Gonzalez (6), Rainer Hessmer (2), Tomas Jackson (6), Nikhil J Joshi (1), Daniel Lam (2), Tsang-Wei Edward Lee (1), Alex Luong (6), Sharath Maddineni (1), Harsh Patel (2), Jodilyn Peralta (6), Jornell Quiambao (5), Diego Reyes (5), Rosario M Jauregui Ruano (6), Dorsa Sadigh (1), Pannag Sanketi (1), Leila Takayama (3), Pavel Vodenski (2), Fei Xia (1)

(1) Google DeepMind; (2) Everyday Robots; (3) Hoku Labs; (4) Logical Robotics; (5) FS Studio; (6) Relentless Adrenalin

arXiv:2405.16021 · cs.RO · June 3, 2024 · 1 citation

Open Access

TL;DR

VADER is a framework enabling robots to detect visual affordances and errors, and recover from failures in long-horizon tasks by seeking help from humans or other robots, improving task completion in dynamic environments.

Contribution

VADER introduces a novel plan-execute-detect framework with a help-seeking skill, integrating visual affordance detection and language model planning for error recovery in robotic tasks.

Findings

01. VADER successfully completes complex tasks by asking for help from robots.

02. VADER effectively uses visual question answering to detect affordances and errors.

03. User study shows improved task performance with VADER's help-seeking capability.

Abstract

Robots today can exploit the rich world knowledge of large language models to chain simple behavioral skills into long-horizon tasks. However, robots often get interrupted during long-horizon tasks due to primitive skill failures and dynamic environments. We propose VADER, a plan, execute, detect framework with seeking help as a new skill that enables robots to recover and complete long-horizon tasks with the help of humans or other robots. VADER leverages visual question answering (VQA) modules to detect visual affordances and recognize execution errors. It then generates prompts for a language model planner (LMP) which decides when to seek help from another robot or human to recover from errors in long-horizon task execution. We show the effectiveness of VADER with two long-horizon robotic tasks. Our pilot study showed that VADER is capable of performing complex long-horizon tasks by…
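The plan, execute, detect loop described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: every name here (`run_task`, `is_afforded`, `execute`, `succeeded`, `seek_help`) is hypothetical, the VQA checks are stubbed as plain callables, and a fixed rule stands in for the language model planner's decision about when to seek help.

```python
from typing import Callable, List, Tuple

# A log entry is (skill name, outcome), where outcome is one of
# "done", "recovered_with_help", or "failed".
LogEntry = Tuple[str, str]


def run_task(
    plan: List[str],
    is_afforded: Callable[[str], bool],  # hypothetical stand-in for a VQA affordance check
    execute: Callable[[str], None],      # hypothetical primitive-skill executor
    succeeded: Callable[[str], bool],    # hypothetical stand-in for a VQA error check
    seek_help: Callable[[str], bool],    # asks a human or another robot; True if resolved
) -> List[LogEntry]:
    """Run each skill in order, seeking help when a check fails.

    Assumption: help is requested on every failed check. In the paper the
    language model planner makes this decision from generated prompts.
    """
    log: List[LogEntry] = []
    for skill in plan:
        # Detect (before): can the skill start in the current scene?
        if not is_afforded(skill) and not seek_help(f"blocked before '{skill}'"):
            log.append((skill, "failed"))
            break

        # Execute the primitive skill.
        execute(skill)

        # Detect (after): did the skill achieve its effect?
        if succeeded(skill):
            log.append((skill, "done"))
        elif seek_help(f"'{skill}' did not succeed"):
            log.append((skill, "recovered_with_help"))
        else:
            log.append((skill, "failed"))
            break
    return log


def demo() -> List[LogEntry]:
    """Simulate a three-step task in which the grasp fails and a helper recovers it."""
    plan = ["go to table", "pick up can", "place can in bin"]
    failing = {"pick up can"}  # simulated primitive-skill failure
    return run_task(
        plan,
        is_afforded=lambda skill: True,
        execute=lambda skill: None,
        succeeded=lambda skill: skill not in failing,
        seek_help=lambda message: True,  # the helper always resolves the problem
    )
```

Calling `demo()` returns the per-skill log for the simulated run. Passing the checks in as callables keeps the loop independent of any particular perception or planning model, so a real VQA module or planner could be substituted without changing the control flow.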

Taxonomy

Topics: Robot Manipulation and Learning · Anomaly Detection Techniques and Applications · Industrial Vision Systems and Defect Detection