O2A: One-shot Observational learning with Action vectors
Leo Pauly, Wisdom C. Agboh, David C. Hogg, Raul Fuentes

TL;DR
O2A is a one-shot method for learning robotic manipulation tasks from a single third-person demonstration video. It uses action vectors extracted by a pre-trained 3D-CNN to compute the reward for reinforcement learning, and is evaluated under domain shifts between the demonstration and the robot's execution.
Contribution
The paper presents what the authors describe as, to their knowledge, the first approach to learning a robotic manipulation task from a single third-person demonstration, using action vectors for reward computation.
Findings
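O2A outperforms baseline methods under domain shifts between demonstration and execution (viewpoint, object properties, scene background, manipulator morphology).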
O2A achieves performance comparable to an oracle with ideal rewards.
O2A is effective in both simulation and real-robot experiments.
Abstract
We present O2A, a novel method for learning to perform robotic manipulation tasks from a single (one-shot) third-person demonstration video. To our knowledge, it is the first time this has been done for a single demonstration. The key novelty lies in pre-training a feature extractor for creating a perceptual representation for actions that we call 'action vectors'. The action vectors are extracted using a 3D-CNN model pre-trained as an action classifier on a generic action dataset. The distance between the action vectors from the observed third-person demonstration and trial robot executions is used as a reward for reinforcement learning of the demonstrated task. We report on experiments in simulation and on a real robot, with changes in viewpoint of observation, properties of the objects involved, scene background and morphology of the manipulator between the demonstration and the…
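The reward described in the abstract can be illustrated with a minimal sketch. The abstract states only that the distance between the demonstration's and the trial's action vectors is used as a reward; the choice of Euclidean distance, the negation (so that closer trials score higher), and the function names below are assumptions made for illustration, not the paper's implementation. The short lists stand in for action vectors that would in practice come from the pre-trained 3D-CNN.

```python
import math
from typing import Sequence


def action_vector_distance(demo_vec: Sequence[float],
                           trial_vec: Sequence[float]) -> float:
    """Euclidean distance between two action vectors (assumed metric)."""
    if len(demo_vec) != len(trial_vec):
        raise ValueError("action vectors must have the same length")
    return math.sqrt(sum((d - t) ** 2 for d, t in zip(demo_vec, trial_vec)))


def reward_from_action_vectors(demo_vec: Sequence[float],
                               trial_vec: Sequence[float]) -> float:
    """Hypothetical reward: negated distance, so a trial whose action
    vector is closer to the demonstration's receives a higher reward."""
    return -action_vector_distance(demo_vec, trial_vec)


# Stand-in vectors; real ones would be 3D-CNN features of the videos.
demo = [0.2, 0.9, 0.1]
close_trial = [0.25, 0.85, 0.1]
far_trial = [0.9, 0.1, 0.7]

# The trial that resembles the demonstration earns the larger reward.
assert reward_from_action_vectors(demo, close_trial) > \
    reward_from_action_vectors(demo, far_trial)
```

In a reinforcement learning loop, a reward of this kind would be computed once per trial execution from the recorded video of the robot, with the demonstration's action vector held fixed.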
