Replanning Human-Robot Collaborative Tasks with Vision-Language Models via Semantic and Physical Dual-Correction
Taichi Kato, Takuya Kiyokawa, Namiko Saito, and Kensuke Harada

TL;DR
This paper introduces a dual-correction framework for human-robot collaboration that enhances vision-language model reasoning with internal and external checks, improving task success and robustness in assembly tasks.
Contribution
It presents a novel dual-correction mechanism integrating internal logical verification and external failure rectification to improve VLM-based HRC performance.
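The control flow implied by this dual-correction idea can be sketched as a plan, verify, execute, and replan loop. This is a minimal illustration under stated assumptions, not the authors' implementation. The names `run_with_dual_correction`, `propose_plan`, `internal_check`, `execute_step`, and `StepResult` are hypothetical, and the toy planner, checker, and executor stand in for the VLM, the internal correction model, and the robot.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class StepResult:
    ok: bool
    failure: Optional[str] = None  # description of a physical failure, if any


def run_with_dual_correction(
    instruction: str,
    propose_plan: Callable[[str, List[str]], List[str]],   # stand-in for the VLM planner
    internal_check: Callable[[List[str]], Optional[str]],  # returns an issue, or None if the plan passes
    execute_step: Callable[[str], StepResult],             # stand-in for robot execution
    max_rounds: int = 3,
) -> bool:
    """Return True if some plan passes the internal check and executes fully."""
    feedback: List[str] = []
    for _ in range(max_rounds):
        plan = propose_plan(instruction, feedback)
        # Internal correction: reject inconsistent plans before any motion.
        issue = internal_check(plan)
        if issue is not None:
            feedback.append(f"internal: {issue}")
            continue
        # External correction: record execution failures and replan.
        for step in plan:
            result = execute_step(step)
            if not result.ok:
                feedback.append(f"external: {step} failed ({result.failure})")
                break
        else:
            return True  # every step succeeded
    return False


# Toy stand-ins (assumptions for illustration only).
def toy_planner(instruction: str, feedback: List[str]) -> List[str]:
    if not feedback:
        return ["place lid"]  # flawed first plan: nothing is held yet
    if any(f.startswith("external") for f in feedback):
        return ["pick lid slowly", "place lid"]
    return ["pick lid", "place lid"]


def toy_check(plan: List[str]) -> Optional[str]:
    held = False
    for step in plan:
        if step.startswith("pick"):
            held = True
        elif step.startswith("place") and not held:
            return "place before pick"
    return None


def toy_execute(step: str) -> StepResult:
    if step == "pick lid":
        return StepResult(False, "grasp slipped")
    return StepResult(True)


success = run_with_dual_correction("attach the lid", toy_planner, toy_check, toy_execute)
```

In this toy run, the first plan is rejected by the internal check, the second fails during execution, and the third succeeds, so `success` is `True` within the default three rounds.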
Findings
Improved success rate in simulation studies.
Effective real-world replanning in assembly tasks.
Enhanced robustness against physical failures.
Abstract
Human-Robot Collaboration (HRC) plays an important role in assembly tasks by enabling robots to plan and adjust their motions based on interactive, real-time human instructions. However, such instructions are often linguistically ambiguous and underspecified, making it difficult to generate physically feasible and cooperative robot behaviors. To address this challenge, many studies have applied Vision-Language Models (VLMs) to interpret high-level instructions and generate corresponding actions. Nevertheless, VLM-based approaches still suffer from hallucinated reasoning and an inability to anticipate physical execution failures. To overcome these limitations, we propose an HRC framework that augments VLM-based reasoning with a dual-correction mechanism: an internal correction model that verifies logical consistency and task feasibility prior to action execution, and an external…
Taxonomy
Topics: Robot Manipulation and Learning · Social Robot Interaction and HRI · Multimodal Machine Learning Applications
