Coarse-to-Fine Imitation Learning: Robot Manipulation from a Single Demonstration
Edward Johns

TL;DR
This paper presents a visual imitation learning method that lets a robot learn a new manipulation task from a single human demonstration. It treats imitation as a state estimation problem and splits each task into a coarse approach trajectory followed by a fine interaction trajectory.
Contribution
The author introduces an approach that learns from one demonstration without prior object knowledge, using self-supervised state estimation and a coarse-to-fine trajectory model.
Findings
Successfully learned 8 diverse manipulation tasks from a single demonstration
Achieved stable and interpretable control in real-world experiments
No explicit policy learning required for complex interactions
Abstract
We introduce a simple new method for visual imitation learning, which allows a novel robot manipulation task to be learned from a single human demonstration, without requiring any prior knowledge of the object being interacted with. Our method models imitation learning as a state estimation problem, with the state defined as the end-effector's pose at the point where object interaction begins, as observed from the demonstration. By then modelling a manipulation task as a coarse, approach trajectory followed by a fine, interaction trajectory, this state estimator can be trained in a self-supervised manner, by automatically moving the end-effector's camera around the object. At test time, the end-effector moves to the estimated state through a linear path, at which point the original demonstration's end-effector velocities are simply replayed. This enables convenient acquisition of a…
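The test-time procedure described in the abstract (move along a linear path to the estimated state, then replay the demonstration's end-effector velocities) can be sketched roughly as follows. This is a minimal illustration and not the paper's implementation: it tracks 3D positions only and ignores orientation, it assumes the estimated interaction-start position is already supplied by some learned estimator that is not shown, and the function names (`linear_approach`, `replay_velocities`, `execute`) and the `step` and `dt` parameters are hypothetical.

```python
import numpy as np

def linear_approach(start, goal, step=0.01):
    """Coarse phase: waypoints on a straight line from start to goal.

    Positions only; orientation is ignored in this simplified sketch.
    """
    start = np.asarray(start, dtype=float)
    goal = np.asarray(goal, dtype=float)
    dist = np.linalg.norm(goal - start)
    n = max(int(np.ceil(dist / step)), 1)  # at least one waypoint
    return [start + (goal - start) * i / n for i in range(1, n + 1)]

def replay_velocities(start, velocities, dt):
    """Fine phase: integrate recorded demonstration velocities from start."""
    pos = np.asarray(start, dtype=float).copy()
    path = []
    for v in velocities:
        pos = pos + np.asarray(v, dtype=float) * dt
        path.append(pos.copy())
    return path

def execute(current, estimated_start, demo_velocities, dt=0.1, step=0.01):
    """Run the coarse approach, then the fine replay.

    estimated_start stands in for the output of a state estimator
    (assumed here, not implemented).
    """
    coarse = linear_approach(current, estimated_start, step)
    fine = replay_velocities(coarse[-1], demo_velocities, dt)
    return coarse, fine

# Example: approach a point 10 cm away, then replay a short downward motion.
coarse, fine = execute(
    current=[0.0, 0.0, 0.0],
    estimated_start=[0.1, 0.0, 0.0],
    demo_velocities=[[0.0, 0.0, -0.1]] * 5,
    dt=0.1,
)
```

In this sketch the replay is open loop: any error in the estimated start position carries straight through to the fine phase, so the accuracy of the state estimate determines how well the interaction is reproduced.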
