On the Guaranteed Almost Equivalence between Imitation Learning from Observation and Demonstration
Zhihao Cheng, Liu Liu, Aishan Liu, Hao Sun, Meng Fang, Dacheng Tao

TL;DR
This paper proves that imitation learning from observation (LfO) is nearly equivalent to imitation learning from demonstration (LfD) in deterministic and bounded-randomness robot environments, supported by theoretical analysis and experiments.
Contribution
It establishes the theoretical and empirical equivalence of LfO and LfD in practical robot environments with bounded randomness.
Findings
LfO achieves comparable performance to LfD in robot tasks.
Inverse dynamics disagreement between LfO and LfD approaches zero in deterministic environments.
The optimization objectives of LfD and LfO remain nearly the same under bounded randomness.
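The second finding can be illustrated with a toy tabular sketch. This is not the paper's control-theoretic argument. It relies only on the chain rule of KL divergence, which splits the state-action gap between learner and expert (the LfD side) into a state-transition gap (the LfO side) plus an inverse dynamics disagreement term. All names, the three-state environment, and the two policies below are hypothetical, and both policies share one state distribution for simplicity. When deterministic dynamics map each action to a distinct next state, the disagreement term vanishes; when two actions collapse onto the same next state, it does not.

```python
import math
from collections import defaultdict

def kl(p, q):
    """KL(p || q) for dicts mapping outcomes to probabilities (same support)."""
    return sum(pv * math.log(pv / q[k]) for k, pv in p.items() if pv > 0)

def joint(state_dist, policy, step):
    """Joint over (s, a, s') under deterministic dynamics s' = step(s, a)."""
    out = defaultdict(float)
    for s, ps in state_dist.items():
        for a, pa in policy[s].items():
            out[(s, a, step(s, a))] += ps * pa
    return dict(out)

def marginal_ss(p_sas):
    """Marginalize the action out of a joint over (s, a, s')."""
    out = defaultdict(float)
    for (s, _a, s_next), v in p_sas.items():
        out[(s, s_next)] += v
    return dict(out)

def inverse_dynamics_disagreement(p_sas, q_sas):
    """Chain rule: KL over (s, a, s') minus KL over (s, s') is the
    expected KL between the two inverse dynamics models p(a | s, s')."""
    return kl(p_sas, q_sas) - kl(marginal_ss(p_sas), marginal_ss(q_sas))

# Hypothetical toy setup: 3 states, 3 actions, full-support policies.
states = [0, 1, 2]
state_dist = {0: 0.5, 1: 0.3, 2: 0.2}  # assumed shared by both policies
learner = {s: {0: 0.2, 1: 0.3, 2: 0.5} for s in states}
expert = {s: {0: 0.6, 1: 0.3, 2: 0.1} for s in states}

def step_injective(s, a):
    # Each action leads to a distinct next state.
    return (s + a) % 3

def step_collapsed(s, a):
    # Actions 1 and 2 lead to the same next state.
    return (s + min(a, 1)) % 3

d_injective = inverse_dynamics_disagreement(
    joint(state_dist, learner, step_injective),
    joint(state_dist, expert, step_injective),
)
d_collapsed = inverse_dynamics_disagreement(
    joint(state_dist, learner, step_collapsed),
    joint(state_dist, expert, step_collapsed),
)
print("injective dynamics:", d_injective)
print("collapsed dynamics:", d_collapsed)
```

In the injective case an observed transition (s, s') identifies the action, so matching transitions is the same as matching state-action pairs. In the collapsed case the action cannot be recovered from the transition, which leaves a residual gap between the two objectives.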
Abstract
Imitation learning from observation (LfO) is preferable to imitation learning from demonstration (LfD) because it does not require expert actions when reconstructing the expert policy from the expert data. However, previous studies imply that LfO performs far worse than LfD, which makes it challenging to employ LfO in practice. By contrast, this paper proves that LfO is almost equivalent to LfD in the deterministic robot environment, and more generally even in the robot environment with bounded randomness. In the deterministic robot environment, from the perspective of control theory, we show that the inverse dynamics disagreement between LfO and LfD approaches zero, meaning that LfO is almost equivalent to LfD. To further relax the deterministic constraint and better adapt to the practical environment, we consider bounded randomness in the robot…
Taxonomy
Topics: Robot Manipulation and Learning · Reinforcement Learning in Robotics · Robotics and Sensor-Based Localization
