Learning visual servo policies via planner cloning
Ulrich Viereck, Kate Saenko, Robert Platt

TL;DR
This paper introduces Penalized Q Cloning (PQC), a behavior cloning algorithm that learns visual servo policies by mimicking a full-state motion planner in simulation. The learned policies transfer to a real robot with a success rate of approximately 87%.
Contribution
The paper proposes Penalized Q Cloning (PQC), a new behavior cloning algorithm for learning visual servo policies from a motion planner, and shows that the resulting policies transfer from simulation to a real robot.
Findings
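PQC outperforms several baselines and ablations on simulated visual servoing tasks that require obstacle avoidance in novel environments.
Policies achieve approximately an 87% success rate both in simulation and on a real robot.
Policies trained in simulation transfer effectively to a real robotic platform.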
Abstract
Learning control policies for visual servoing in novel environments is an important problem. However, standard model-free policy learning methods are slow. This paper explores planner cloning: using behavior cloning to learn policies that mimic the behavior of a full-state motion planner in simulation. We propose Penalized Q Cloning (PQC), a new behavior cloning algorithm. We show that it outperforms several baselines and ablations on some challenging problems involving visual servoing in novel environments while avoiding obstacles. Finally, we demonstrate that these policies can be transferred effectively onto a real robotic platform, achieving approximately an 87% success rate both in simulation and on a real robot.
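To make the idea of planner cloning concrete, the following is a minimal sketch of plain behavior cloning from an expert that has full state access. It is not PQC; the penalty term that distinguishes PQC is not described on this page and is not reproduced here. Everything in the block is an illustrative assumption: the "planner" is a toy greedy rule rather than a real motion planner, the observation is a 2D target offset standing in for an image, and the names `planner_action`, `make_dataset`, `train_policy`, and `policy_action` are hypothetical.

```python
import numpy as np

# Four discrete moves: +x, -x, +y, -y (assumed toy action set).
ACTIONS = np.array([[1, 0], [-1, 0], [0, 1], [0, -1]])

def planner_action(offset):
    """Toy full-state expert: step along the axis with the larger error."""
    axis = int(np.argmax(np.abs(offset)))
    sign = 1 if offset[axis] > 0 else -1
    return {(0, 1): 0, (0, -1): 1, (1, 1): 2, (1, -1): 3}[(axis, sign)]

def make_dataset(n, rng):
    """Sample states in 'simulation' and label each with the expert's action."""
    offsets = rng.uniform(-1.0, 1.0, size=(n, 2))  # target minus end effector
    labels = np.array([planner_action(o) for o in offsets])
    return offsets, labels

def train_policy(X, y, steps=500, lr=1.0):
    """Fit a linear softmax policy by gradient descent on cross-entropy."""
    W = np.zeros((X.shape[1], len(ACTIONS)))
    Y = np.eye(len(ACTIONS))[y]  # one-hot expert labels
    for _ in range(steps):
        logits = X @ W
        logits -= logits.max(axis=1, keepdims=True)  # numerical stability
        P = np.exp(logits)
        P /= P.sum(axis=1, keepdims=True)
        W -= lr * X.T @ (P - Y) / len(X)
    return W

def policy_action(W, offset):
    """Cloned policy: pick the action with the highest logit."""
    return int(np.argmax(offset @ W))

rng = np.random.default_rng(0)
X_train, y_train = make_dataset(2000, rng)
W = train_policy(X_train, y_train)

X_test, y_test = make_dataset(500, rng)
accuracy = np.mean(np.argmax(X_test @ W, axis=1) == y_test)
print(f"agreement with planner on held-out states: {accuracy:.2f}")
```

The design point this illustrates is that the expert only needs to be available in simulation, where full state is known; the cloned policy is a function of the observation alone and can therefore be run where the planner's inputs are not available.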
