TL;DR
This paper presents a hierarchical framework that combines vision-language models with reinforcement learning so that robots can perform complex, long-horizon deformable linear object (DLO) routing tasks with high success rates.
Contribution
It introduces a novel autonomous system that integrates high-level reasoning with low-level skill execution for deformable linear object routing.
Findings
Achieves 92% success rate across diverse routing scenarios.
Utilizes vision-language models for high-level planning from language commands.
Incorporates a failure recovery mechanism for robustness in long-horizon tasks.
Abstract
Long-horizon routing tasks of deformable linear objects (DLOs), such as cables and ropes, are common in industrial assembly lines and everyday life. These tasks are particularly challenging because they require robots to manipulate DLOs with long-horizon planning and reliable skill execution. Successfully completing such tasks demands adapting to the nonlinear dynamics of DLOs, decomposing abstract routing goals, and generating multi-step plans composed of multiple skills, all of which require accurate high-level reasoning during execution. In this paper, we propose a fully autonomous hierarchical framework for solving challenging DLO routing tasks. Given an implicit or explicit routing goal expressed in language, our framework leverages vision-language models (VLMs) for in-context high-level reasoning to synthesize feasible plans, which are then executed by low-level skills trained via…
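The plan, execute, and recover structure described in the abstract can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the names `run_routing_task`, `toy_planner`, and `flaky_insert` are hypothetical, the planner is a plain function standing in for a VLM, the skill is a stub rather than a learned policy, and recovery is modeled simply as re-planning from the current state. The paper's actual interfaces and recovery mechanism are not given in the text above.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical types: a plan step is (skill_name, argument);
# a skill mutates the state and returns True on success.
Step = Tuple[str, str]
Skill = Callable[[dict, str], bool]
Planner = Callable[[str, dict], List[Step]]


def run_routing_task(goal: str, state: dict, planner: Planner,
                     skills: Dict[str, Skill], max_replans: int = 3):
    """Execute a high-level plan step by step, re-planning when a skill fails."""
    log = []
    replans = 0
    plan = planner(goal, state)
    while plan:
        name, arg = plan.pop(0)
        ok = skills[name](state, arg)
        log.append((name, arg, ok))
        if not ok:
            if replans >= max_replans:
                return False, log
            replans += 1
            # Recovery (assumed): ask the planner again from the current state.
            plan = planner(goal, state)
    # Assumes the planner only emits steps that are still needed for the goal.
    return True, log


def toy_planner(goal: str, state: dict) -> List[Step]:
    """Stand-in for a VLM: plan one 'insert' per clip not yet routed."""
    return [("insert", clip) for clip in goal.split(",")
            if clip not in state["routed"]]


def flaky_insert(state: dict, clip: str) -> bool:
    """Stub skill that fails once on clip B to exercise the recovery path."""
    state["attempts"] += 1
    if clip == "B" and state["attempts"] == 2:
        return False  # simulated execution failure
    state["routed"].append(clip)
    return True


state = {"routed": [], "attempts": 0}
success, log = run_routing_task("A,B,C", state, toy_planner,
                                {"insert": flaky_insert})
```

In this toy run the second step fails once, the planner is queried again with the updated state, and the remaining clips are routed on the new plan. A real system would replace the string goal and dictionary state with language commands and camera observations.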
