Compositional Servoing by Recombining Demonstrations
Max Argus, Abhijeet Nayak, Martin Büchner, Silvio Galesso, Abhinav Valada, Thomas Brox

TL;DR
This paper introduces a visual servoing framework that builds demonstration graphs by splitting and recombining demonstrations, improving task transfer and success rates in manipulation tasks. It is evaluated in simulation and in real-world experiments.
Contribution
It formulates visual servoing as graph traversal with demonstration recombination, enhancing robustness and multitask capabilities from limited demonstrations.
Findings
Recombining demonstrations improves task success rates.
The method effectively transfers tasks in simulation and real-world scenarios.
The graph-based approach enables multitask visual servoing.
Abstract
Learning-based manipulation policies from image inputs often show weak task transfer capabilities. In contrast, visual servoing methods allow efficient task transfer in high-precision scenarios while requiring only a few demonstrations. In this work, we present a framework that formulates the visual servoing task as graph traversal. Our method not only extends the robustness of visual servoing, but also enables multitask capability based on a few task-specific demonstrations. We construct demonstration graphs by splitting existing demonstrations and recombining them. In order to traverse the demonstration graph in the inference case, we utilize a similarity function that helps select the best demonstration for a specific task. This enables us to compute the shortest path through the graph. Ultimately, we show that recombining demonstrations leads to higher task-respective success. We…
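The graph-traversal idea described in the abstract can be illustrated with a minimal sketch. Everything here is an assumption made for illustration, not the authors' implementation: each demonstration is represented as a list of per-segment feature vectors, `cosine_similarity` stands in for the paper's similarity function, the `threshold` and edge costs are arbitrary, and the names `build_graph`, `select_start`, and `shortest_path` are hypothetical.

```python
import heapq
import math


def cosine_similarity(a, b):
    """Stand-in similarity between two feature vectors (assumption)."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


def build_graph(demos, threshold=0.9):
    """Build a directed graph whose nodes are (demo_name, segment_index).

    Consecutive segments of one demonstration are linked with cost 1.0.
    Segments of different demonstrations are linked when their features
    are similar enough, which is what allows recombination.
    """
    graph = {(name, i): [] for name, segs in demos.items() for i in range(len(segs))}
    for name, segs in demos.items():
        for i in range(len(segs) - 1):
            graph[(name, i)].append(((name, i + 1), 1.0))
    for a, segs_a in demos.items():
        for b, segs_b in demos.items():
            if a == b:
                continue
            for i, fa in enumerate(segs_a):
                for j, fb in enumerate(segs_b):
                    sim = cosine_similarity(fa, fb)
                    if sim >= threshold:
                        # Switching cost shrinks as similarity grows; clamp at 0.
                        graph[(a, i)].append(((b, j), max(0.0, 1.0 - sim)))
    return graph


def select_start(demos, observation):
    """Pick the segment whose features best match the current observation."""
    nodes = [(name, i) for name, segs in demos.items() for i in range(len(segs))]
    return max(nodes, key=lambda n: cosine_similarity(demos[n[0]][n[1]], observation))


def shortest_path(graph, start, goal):
    """Dijkstra's algorithm; returns (path, cost) or (None, inf) if unreachable."""
    dist = {start: 0.0}
    prev = {}
    heap = [(0.0, start)]
    while heap:
        d, node = heapq.heappop(heap)
        if node == goal:
            break
        if d > dist.get(node, math.inf):
            continue  # stale heap entry
        for nxt, w in graph[node]:
            nd = d + w
            if nd < dist.get(nxt, math.inf):
                dist[nxt] = nd
                prev[nxt] = node
                heapq.heappush(heap, (nd, nxt))
    if goal not in dist:
        return None, math.inf
    path = [goal]
    while path[-1] != start:
        path.append(prev[path[-1]])
    path.reverse()
    return path, dist[goal]


# Toy data: segment 1 of demo "A" matches segment 1 of demo "B".
demos = {
    "A": [[1, 0, 0], [0, 1, 0]],
    "B": [[0, 0, 1], [0, 1, 0], [1, 1, 1]],
}
graph = build_graph(demos)
start = select_start(demos, [0.9, 0.1, 0.0])
path, cost = shortest_path(graph, start, ("B", 2))
print(path, cost)
```

In this toy example the path begins in demonstration "A" and finishes in demonstration "B", reaching a goal that "A" alone never covers. A real system would derive the features from images and would likely use learned or geometric similarity measures rather than cosine similarity on hand-written vectors.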
Taxonomy
Topics: Domain Adaptation and Few-Shot Learning · Advanced Vision and Imaging · Multimodal Machine Learning Applications
